Set up TimescaleDB high availability with streaming replication and automatic failover

Advanced 45 min Apr 22, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Configure TimescaleDB with PostgreSQL streaming replication for high availability. Set up primary and standby servers with hot standby mode, implement automatic failover with pg_auto_failover, and monitor replication status for production-ready time-series database clustering.

Prerequisites

  • At least 2 servers with 4GB RAM
  • PostgreSQL 16 compatible system
  • Network connectivity between servers
  • Basic knowledge of PostgreSQL administration

What this solves

TimescaleDB high availability ensures your time-series database remains operational during server failures, hardware issues, or maintenance windows. This tutorial sets up PostgreSQL streaming replication with TimescaleDB extensions, configures automatic failover using pg_auto_failover, and establishes monitoring for replication lag and cluster health.

Step-by-step installation

Update system packages

Start by updating your package manager on both primary and standby servers to ensure you get the latest versions.

# Ubuntu/Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux/Rocky
sudo dnf update -y

Install PostgreSQL and TimescaleDB

Install PostgreSQL 16 and TimescaleDB on both primary and standby servers. TimescaleDB provides time-series optimizations on top of PostgreSQL.

# Ubuntu/Debian: add the TimescaleDB repository and install
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/timescaledb.gpg
echo "deb [signed-by=/usr/share/keyrings/timescaledb.gpg] https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main" | sudo tee /etc/apt/sources.list.d/timescaledb.list
sudo apt update
sudo apt install -y postgresql-16 postgresql-client-16 timescaledb-2-postgresql-16

# AlmaLinux/Rocky: add the PGDG and TimescaleDB repositories and install
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
echo "[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/\$releasever/\$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://packagecloud.io/timescale/timescaledb/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
metadata_expire=300" | sudo tee /etc/yum.repos.d/timescale_timescaledb.repo
sudo dnf install -y postgresql16-server postgresql16 timescaledb-2-postgresql-16

Initialize PostgreSQL database (AlmaLinux/Rocky only)

On AlmaLinux and Rocky Linux systems, manually initialize the PostgreSQL database cluster.

# Ubuntu/Debian: the cluster is initialized automatically during installation
# AlmaLinux/Rocky: initialize, enable, and start it manually
sudo /usr/pgsql-16/bin/postgresql-16-setup initdb
sudo systemctl enable postgresql-16
sudo systemctl start postgresql-16

Configure TimescaleDB extension

Add TimescaleDB to PostgreSQL's shared libraries and tune the configuration for time-series workloads.

sudo timescaledb-tune --quiet --yes
sudo systemctl restart postgresql  # postgresql-16 on AlmaLinux/Rocky

Configure primary server for replication

On the primary server, configure PostgreSQL for streaming replication by enabling WAL archiving and setting the replication parameters. Edit postgresql.conf (/etc/postgresql/16/main/postgresql.conf on Ubuntu/Debian, /var/lib/pgsql/16/data/postgresql.conf on AlmaLinux/Rocky).

# Replication settings
listen_addresses = '*'
wal_level = replica
max_wal_senders = 3
max_replication_slots = 3
wal_keep_size = 1GB
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/16/main/archive/%f'
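
The settings above have to end up in postgresql.conf exactly once. A minimal sketch of an idempotent helper (set_pg_option is a hypothetical function, not a PostgreSQL tool) that replaces an existing, possibly commented-out, assignment or appends a new one:

```shell
#!/bin/sh
# Hypothetical helper: idempotently set "key = value" in a
# postgresql.conf-style file.
set_pg_option() {
    conf="$1"; key="$2"; value="$3"
    if grep -Eq "^[#[:space:]]*${key}[[:space:]]*=" "$conf"; then
        # Replace an existing (possibly commented-out) assignment in place
        sed -i -E "s|^[#[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$conf"
    else
        # Key not present yet: append it
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Demo against a scratch file; point at your real postgresql.conf instead.
conf=$(mktemp)
set_pg_option "$conf" wal_level replica
set_pg_option "$conf" max_wal_senders 3
```

Running the helper twice with the same key leaves a single assignment, so it is safe to bake into provisioning scripts.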

TimescaleDB settings

shared_preload_libraries = 'timescaledb' max_connections = 200 shared_buffers = 256MB effective_cache_size = 1GB work_mem = 4MB maintenance_work_mem = 64MB

Create archive directory

Create the WAL archive directory with proper permissions for the postgres user.

sudo mkdir -p /var/lib/postgresql/16/main/archive
sudo chown postgres:postgres /var/lib/postgresql/16/main/archive
sudo chmod 750 /var/lib/postgresql/16/main/archive

Configure authentication for replication

Set up host-based authentication so the standby server can connect for replication. Edit pg_hba.conf (it sits in the same directory as postgresql.conf) on the primary.

# Add these lines (replace 203.0.113.20 with your standby server IP)
host    replication     replicator      203.0.113.20/32         scram-sha-256
host    all             postgres        203.0.113.20/32         scram-sha-256
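
If you provision servers with scripts, appending these entries blindly duplicates them on every run. A small sketch that adds each pg_hba.conf line only when it is missing (HBA_FILE defaults to a local copy here for illustration; point it at your real pg_hba.conf):

```shell
#!/bin/sh
# Sketch: append replication HBA entries only if not already present.
HBA_FILE=${HBA_FILE:-./pg_hba.conf}   # set to your real pg_hba.conf path
STANDBY_IP=203.0.113.20
touch "$HBA_FILE"

add_hba_line() {
    line="$1"
    # -x matches the whole line, -F disables regex interpretation
    grep -qxF "$line" "$HBA_FILE" || printf '%s\n' "$line" >> "$HBA_FILE"
}

add_hba_line "host    replication     replicator      ${STANDBY_IP}/32         scram-sha-256"
add_hba_line "host    all             postgres        ${STANDBY_IP}/32         scram-sha-256"
```

After changing pg_hba.conf, reload PostgreSQL so the new rules take effect.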

Create replication user

Create a dedicated user for replication with appropriate privileges.

sudo systemctl restart postgresql
sudo -u postgres psql -c "CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'SecureReplicationPass123';"
sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
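
The base backup step below uses -W, which prompts for the replicator password interactively. For unattended runs you can store the credentials in a .pgpass file instead; this sketch writes a demo copy (PGPASSFILE defaults to a local path here), using the primary IP and password from this tutorial:

```shell
#!/bin/sh
# Sketch: store replicator credentials so pg_basebackup and the WAL
# receiver can authenticate non-interactively.
PGPASSFILE=${PGPASSFILE:-./pgpass}   # normally ~postgres/.pgpass on the standby

# Format: hostname:port:database:username:password
printf '%s\n' '203.0.113.10:5432:replication:replicator:SecureReplicationPass123' > "$PGPASSFILE"
chmod 600 "$PGPASSFILE"   # PostgreSQL ignores the file unless it is 0600
```

On the standby, place this at /var/lib/postgresql/.pgpass owned by postgres, then drop -W from the pg_basebackup command.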

Test primary server setup

Verify that TimescaleDB is properly installed and the primary server is ready for replication.

sudo -u postgres psql -c "SELECT version();"
sudo -u postgres psql -c "\dx timescaledb"
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"

Create base backup for standby

On the standby server, create a base backup from the primary server to initialize the replica.

sudo systemctl stop postgresql
sudo -u postgres rm -rf /var/lib/postgresql/16/main/*
sudo -u postgres pg_basebackup -h 203.0.113.10 -D /var/lib/postgresql/16/main -U replicator -W -v -P -R
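
With -R, pg_basebackup writes a standby.signal file and appends primary_conninfo to postgresql.auto.conf in the target directory. A quick sanity check before starting the standby (check_standby_dir is a hypothetical helper, not part of PostgreSQL):

```shell
#!/bin/sh
# Sketch: verify a data directory produced by pg_basebackup -R is ready
# to start as a standby.
check_standby_dir() {
    dir="$1"
    # standby.signal tells PostgreSQL to start in recovery mode
    [ -f "$dir/standby.signal" ] || { echo "missing standby.signal"; return 1; }
    # primary_conninfo tells the WAL receiver where to stream from
    grep -q 'primary_conninfo' "$dir/postgresql.auto.conf" 2>/dev/null \
        || { echo "missing primary_conninfo"; return 1; }
    echo "standby data directory looks ok"
}
```

Usage on the standby: check_standby_dir /var/lib/postgresql/16/main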

Configure standby server

Configure the standby server for hot standby mode and streaming replication.

# Hot standby settings
hot_standby = on
hot_standby_feedback = on
wal_receiver_timeout = 60s
max_standby_streaming_delay = 30s
max_standby_archive_delay = 30s

TimescaleDB settings (same as primary)

shared_preload_libraries = 'timescaledb'
max_connections = 200
shared_buffers = 256MB
effective_cache_size = 1GB

Start standby server

Start the standby server and verify it connects to the primary for streaming replication.

sudo systemctl start postgresql
sudo systemctl enable postgresql
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -c "SELECT * FROM pg_stat_wal_receiver;"

Install pg_auto_failover

Install pg_auto_failover for automatic failover management on both servers.

# Ubuntu/Debian
sudo apt install -y postgresql-16-auto-failover

# AlmaLinux/Rocky
sudo dnf install -y pg_auto_failover_16

Set up monitor node

Initialize the pg_auto_failover monitor on a separate server or the primary server.

export PGDATA=/var/lib/postgresql/monitor
sudo -u postgres mkdir -p /var/lib/postgresql/monitor
sudo -u postgres pg_autoctl create monitor --hostname 203.0.113.10 --pgdata /var/lib/postgresql/monitor

Register primary server with monitor

Register the primary TimescaleDB server with the pg_auto_failover monitor.

export PGDATA=/var/lib/postgresql/16/main
sudo -u postgres pg_autoctl create postgres --hostname 203.0.113.10 --pgdata /var/lib/postgresql/16/main --monitor 'postgres://autoctl_node@203.0.113.10:5432/pg_auto_failover'

Register standby server with monitor

Register the standby TimescaleDB server with the monitor for automatic failover.

export PGDATA=/var/lib/postgresql/16/main
sudo -u postgres pg_autoctl create postgres --hostname 203.0.113.20 --pgdata /var/lib/postgresql/16/main --monitor 'postgres://autoctl_node@203.0.113.10:5432/pg_auto_failover'

Create systemd services for pg_auto_failover

Create systemd services to manage pg_auto_failover automatically.

# Save as /etc/systemd/system/pgautofailover.service on both servers
[Unit]
Description=pg_auto_failover
Requires=postgresql.service
After=postgresql.service
Requisite=postgresql.service

[Service]
Type=exec
User=postgres
Group=postgres
# On AlmaLinux/Rocky, use ExecStart=/usr/pgsql-16/bin/pg_autoctl run
ExecStart=/usr/lib/postgresql/16/bin/pg_autoctl run
Environment=PGDATA=/var/lib/postgresql/16/main
Restart=always
RestartSec=5
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target

sudo systemctl daemon-reload
sudo systemctl enable pgautofailover
sudo systemctl start pgautofailover

Configure connection pooling

Set up connection pooling to handle failover transparently for applications.

# Ubuntu/Debian
sudo apt install -y pgbouncer
# AlmaLinux/Rocky
sudo dnf install -y pgbouncer

Then edit /etc/pgbouncer/pgbouncer.ini:

[databases]
timescaledb = host=203.0.113.10 port=5432 dbname=postgres

[pgbouncer]
listen_port = 6432
listen_addr = *
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200
default_pool_size = 20
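
PgBouncer does not follow a failover by itself: after promotion, the [databases] entry still points at the old primary. A minimal sketch of repointing it (repoint_pgbouncer is a hypothetical helper; in production you would drive this from a pg_auto_failover state-change hook rather than by hand):

```shell
#!/bin/sh
# Sketch: rewrite the host= value of the timescaledb entry after failover.
INI=${INI:-./pgbouncer.ini}   # normally /etc/pgbouncer/pgbouncer.ini

repoint_pgbouncer() {
    new_primary="$1"
    # Replace only the host= value, keeping port and dbname intact
    sed -i -E "s|^(timescaledb = host=)[^ ]+|\1${new_primary}|" "$INI"
}
```

Usage after the standby is promoted: repoint_pgbouncer 203.0.113.20, then sudo systemctl reload pgbouncer.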

Step-by-step configuration

Create monitoring database

Create a dedicated database for monitoring TimescaleDB replication metrics.

sudo -u postgres createdb tsdb_monitor
sudo -u postgres psql -d tsdb_monitor -c "CREATE EXTENSION IF NOT EXISTS timescaledb;"
sudo -u postgres psql -d tsdb_monitor -c "CREATE TABLE replication_lag (time TIMESTAMPTZ NOT NULL, lag_seconds REAL, lag_bytes BIGINT);"
sudo -u postgres psql -d tsdb_monitor -c "SELECT create_hypertable('replication_lag', 'time');"

Set up replication monitoring

Create a script to monitor replication lag and store metrics in TimescaleDB.

#!/bin/bash
# Save as /usr/local/bin/monitor-replication.sh
set -e

# Get replication lag (pg_last_xact_replay_timestamp() is only meaningful
# on the standby; pg_stat_replication is populated on the primary)
LAG_SECONDS=$(sudo -u postgres psql -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int;" 2>/dev/null || echo "0")
LAG_BYTES=$(sudo -u postgres psql -t -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication WHERE application_name = 'walreceiver';" 2>/dev/null || echo "0")

# Insert metrics into the monitoring table
sudo -u postgres psql -d tsdb_monitor -c "INSERT INTO replication_lag (time, lag_seconds, lag_bytes) VALUES (NOW(), $LAG_SECONDS, $LAG_BYTES);"
echo "Replication lag: ${LAG_SECONDS}s, ${LAG_BYTES} bytes"

sudo chmod +x /usr/local/bin/monitor-replication.sh

Configure automated monitoring

Set up a cron job to collect replication metrics every minute.

echo "* * * * * /usr/local/bin/monitor-replication.sh >> /var/log/replication-monitor.log 2>&1" | sudo -u postgres crontab -

Create sample time-series data

Create a sample hypertable to test TimescaleDB replication functionality.

sudo -u postgres psql -d postgres -c "CREATE TABLE sensor_data (time TIMESTAMPTZ NOT NULL, sensor_id INT, temperature REAL, humidity REAL);"
sudo -u postgres psql -d postgres -c "SELECT create_hypertable('sensor_data', 'time');"
sudo -u postgres psql -d postgres -c "INSERT INTO sensor_data VALUES (NOW(), 1, 22.5, 45.2), (NOW(), 2, 23.1, 44.8);"
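
Two rows will not produce visible replication traffic. A small sketch that generates a batch of INSERT statements for the sensor_data hypertable (gen_sensor_rows is a hypothetical helper; row count and value ranges are arbitrary):

```shell
#!/bin/sh
# Sketch: emit N INSERT statements for sensor_data to generate write load.
gen_sensor_rows() {
    n="$1"
    i=1
    while [ "$i" -le "$n" ]; do
        # sensor_id cycles 0-9; temperature/humidity vary deterministically
        echo "INSERT INTO sensor_data VALUES (NOW(), $((i % 10)), $((15 + i % 10)).5, $((40 + i % 20)).0);"
        i=$((i + 1))
    done
}
```

Pipe the output into psql on the primary, then count rows on the standby to watch replication catch up: gen_sensor_rows 1000 | sudo -u postgres psql -d postgres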

Verify your setup

Test the TimescaleDB high availability setup with these verification commands.

# Check PostgreSQL and TimescaleDB status
sudo -u postgres psql -c "SELECT version();"
sudo -u postgres psql -c "\dx timescaledb"

Check replication status

sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
sudo -u postgres psql -c "SELECT * FROM pg_replication_slots;"

Verify pg_auto_failover status

sudo -u postgres pg_autoctl show state
sudo -u postgres pg_autoctl show events
# Verify standby server is in recovery mode
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

Check WAL receiver status

sudo -u postgres psql -c "SELECT * FROM pg_stat_wal_receiver;"

Test read-only access to replicated data

sudo -u postgres psql -c "SELECT COUNT(*) FROM sensor_data;"
# Simulate primary server failure
sudo systemctl stop postgresql

Verify automatic failover (run on former standby)

sudo -u postgres pg_autoctl show state

Test write access on new primary

sudo -u postgres psql -c "INSERT INTO sensor_data VALUES (NOW(), 3, 21.8, 46.5);"

Monitor replication status and performance

Set up alerting for replication lag

Create a script to alert when replication lag exceeds thresholds.

#!/bin/bash
# Save as /usr/local/bin/check-replication-health.sh
set -e

MAX_LAG_SECONDS=30
MAX_LAG_BYTES=16777216  # 16MB

# Get current replication lag
LAG_SECONDS=$(sudo -u postgres psql -t -c "SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int;" 2>/dev/null || echo "999")
LAG_BYTES=$(sudo -u postgres psql -t -c "SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) FROM pg_stat_replication WHERE application_name = 'walreceiver';" 2>/dev/null || echo "999999999")

# Check thresholds and alert
if [[ $LAG_SECONDS -gt $MAX_LAG_SECONDS ]] || [[ $LAG_BYTES -gt $MAX_LAG_BYTES ]]; then
    echo "ALERT: Replication lag exceeded thresholds - ${LAG_SECONDS}s, ${LAG_BYTES} bytes" | logger -t timescaledb-replication
    # Add email/Slack notification here
else
    echo "Replication health check: OK (lag: ${LAG_SECONDS}s, ${LAG_BYTES} bytes)"
fi

sudo chmod +x /usr/local/bin/check-replication-health.sh
echo "*/5 * * * * /usr/local/bin/check-replication-health.sh" | sudo crontab -

Monitor cluster with pg_auto_failover

Use pg_auto_failover commands to monitor cluster health and perform maintenance.

# Show current cluster state
sudo -u postgres pg_autoctl show state

Show cluster events and history

sudo -u postgres pg_autoctl show events --count 10

Show cluster configuration

sudo -u postgres pg_autoctl config show

Enable maintenance mode for updates

sudo -u postgres pg_autoctl enable maintenance

Test failover scenarios and recovery procedures

Test automatic failover

Simulate primary server failure and verify automatic promotion of standby.

# Stop primary PostgreSQL to trigger failover
sudo systemctl stop postgresql
sudo systemctl stop pgautofailover
# Watch failover process
sudo -u postgres pg_autoctl watch

Verify new primary status (run on former standby)

sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -c "INSERT INTO sensor_data VALUES (NOW(), 4, 20.5, 47.2);"

Test manual failover

Perform planned failover for maintenance scenarios.

# Enable maintenance mode on primary
sudo -u postgres pg_autoctl enable maintenance

Perform manual failover

sudo -u postgres pg_autoctl perform failover

Verify cluster state after failover

sudo -u postgres pg_autoctl show state

Test recovery procedures

Restore the original primary server and add it back to the cluster.

# Restart services after fixing issues
sudo systemctl start postgresql
sudo systemctl start pgautofailover

Verify node rejoins as standby

sudo -u postgres pg_autoctl show state
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"

Test data consistency

sudo -u postgres psql -c "SELECT COUNT(*) FROM sensor_data;"

Create backup and recovery strategy

Set up automated backups that work with your high availability setup.

#!/bin/bash
# Save as /usr/local/bin/timescaledb-backup.sh
set -e

BACKUP_DIR="/var/backups/timescaledb"
DATE=$(date +%Y%m%d_%H%M%S)

# Create backup directory
sudo mkdir -p $BACKUP_DIR

# Create logical backup
sudo -u postgres pg_dumpall --clean --if-exists > $BACKUP_DIR/full_backup_$DATE.sql

# Create WAL archive backup
sudo tar -czf $BACKUP_DIR/wal_archive_$DATE.tar.gz /var/lib/postgresql/16/main/archive/

# Clean up old backups (keep 7 days)
find $BACKUP_DIR -name "*.sql" -mtime +7 -delete
find $BACKUP_DIR -name "*.tar.gz" -mtime +7 -delete
echo "Backup completed: $BACKUP_DIR/full_backup_$DATE.sql"

sudo chmod +x /usr/local/bin/timescaledb-backup.sh
echo "0 2 * * * /usr/local/bin/timescaledb-backup.sh >> /var/log/timescaledb-backup.log 2>&1" | sudo crontab -
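
A backup you have not verified is a hope, not a strategy. A quick sketch that checks a pg_dumpall file is non-empty and ends with the completion marker pg_dumpall writes (verify_backup is a hypothetical helper):

```shell
#!/bin/sh
# Sketch: quick integrity check on a logical pg_dumpall backup file.
verify_backup() {
    f="$1"
    [ -s "$f" ] || { echo "backup empty or missing"; return 1; }
    # pg_dumpall ends its output with a completion comment; a truncated
    # dump (full disk, killed process) will lack it
    tail -n 5 "$f" | grep -q 'PostgreSQL database cluster dump complete' \
        || { echo "backup appears truncated"; return 1; }
    echo "backup looks complete"
}
```

Usage: verify_backup /var/backups/timescaledb/full_backup_20260422_020000.sql, ideally as the last step of the backup script so failures reach the log.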

Common issues

Symptom | Cause | Fix
Standby server not connecting | Firewall blocking port 5432 | sudo ufw allow 5432 or configure iptables
Authentication failure for replicator | Incorrect pg_hba.conf configuration | Check the host entry and restart PostgreSQL
High replication lag | Network latency or disk I/O bottleneck | Check network connectivity and disk performance
pg_auto_failover not promoting standby | Monitor node cannot reach servers | Check monitor node connectivity and logs
TimescaleDB extension missing after failover | Extension not enabled on standby | Run CREATE EXTENSION timescaledb; after promotion
WAL archive directory full | Archive cleanup not configured | Configure archive_cleanup_command or implement a cleanup script
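
For the archive-full case, a minimal retention sketch (prune_wal_archive is a hypothetical helper; directory and retention are assumptions, and the window must cover your oldest base backup or standby, or recovery will break):

```shell
#!/bin/sh
# Sketch: prune archived WAL files older than a retention window.
ARCHIVE_DIR=${ARCHIVE_DIR:-/var/lib/postgresql/16/main/archive}
RETENTION_DAYS=${RETENTION_DAYS:-7}

prune_wal_archive() {
    # -mtime +N matches files older than N*24 hours; -print logs deletions
    find "$ARCHIVE_DIR" -type f -mtime +"$RETENTION_DAYS" -print -delete
}
```

Run it from the same cron schedule as the backup script, after the WAL archive has been tarred up.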

Next steps

Running this in production?

Want this handled for you? Running this at scale adds a second layer of work: capacity planning, failover drills, cost control, and on-call. See how we run infrastructure like this for European teams.


Need help?

Don't want to manage this yourself?

We handle high availability infrastructure for businesses that depend on uptime. From initial setup to ongoing operations.