Set up TimescaleDB clustering for high availability with automatic failover

Advanced · 45 min · Updated Apr 02, 2026
Applies to: Ubuntu 24.04 · Debian 12 · AlmaLinux 9 · Rocky Linux 9

Configure a production-ready TimescaleDB cluster with streaming replication, automatic failover using Patroni, and etcd for distributed consensus to ensure zero-downtime operation of your time-series database.

Prerequisites

  • Three servers with at least 4GB RAM each
  • Network connectivity between nodes
  • Root or sudo access on all nodes

What this solves

TimescaleDB clustering provides high availability for time-series workloads by eliminating single points of failure. This tutorial sets up a three-node TimescaleDB cluster with automatic failover using Patroni and etcd, ensuring your time-series database remains available even if the primary node fails.

Step-by-step installation

Update system packages

Start by updating the package manager on all three nodes so you install current package versions.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux / Rocky Linux
sudo dnf update -y

Install PostgreSQL and TimescaleDB

Install PostgreSQL 15 with the TimescaleDB extension on all three nodes.

# Ubuntu / Debian
sudo sh -c "echo 'deb https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main' > /etc/apt/sources.list.d/timescaledb.list"
wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | sudo apt-key add -
sudo apt update
sudo apt install -y postgresql-15 timescaledb-2-postgresql-15

# AlmaLinux / Rocky Linux -- the repo definition follows TimescaleDB's
# packagecloud layout; verify it against the current install docs
sudo tee /etc/yum.repos.d/timescale_timescaledb.repo <<'EOF'
[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/$releasever/$basearch
gpgkey=https://packagecloud.io/timescale/timescaledb/gpgkey
repo_gpgcheck=1
gpgcheck=0
enabled=1
EOF
# PostgreSQL 15 itself comes from the PGDG repository on RHEL-family systems
sudo dnf install -y timescaledb-2-postgresql-15

Install etcd cluster

Install and configure etcd on all three nodes for distributed consensus. Replace the IP addresses with your actual node IPs.

# Ubuntu / Debian
sudo apt install -y etcd

# AlmaLinux / Rocky Linux (etcd is not in the base repos; enable EPEL or install a release binary)
sudo dnf install -y etcd

Configure etcd on first node

Configure etcd on the first node (203.0.113.10); this node will bootstrap the cluster. Place the settings in the etcd environment file (/etc/default/etcd on Debian/Ubuntu, /etc/etcd/etcd.conf on RHEL-family systems).

ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://203.0.113.10:2380"
ETCD_LISTEN_CLIENT_URLS="http://203.0.113.10:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://203.0.113.10:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://203.0.113.10:2379"
ETCD_INITIAL_CLUSTER="etcd1=http://203.0.113.10:2380,etcd2=http://203.0.113.11:2380,etcd3=http://203.0.113.12:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="timescale-cluster"

Configure etcd on second node

Configure etcd on the second node (203.0.113.11) with the same cluster settings.

ETCD_NAME="etcd2"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://203.0.113.11:2380"
ETCD_LISTEN_CLIENT_URLS="http://203.0.113.11:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://203.0.113.11:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://203.0.113.11:2379"
ETCD_INITIAL_CLUSTER="etcd1=http://203.0.113.10:2380,etcd2=http://203.0.113.11:2380,etcd3=http://203.0.113.12:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="timescale-cluster"

Configure etcd on third node

Configure etcd on the third node (203.0.113.12) to complete the cluster.

ETCD_NAME="etcd3"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://203.0.113.12:2380"
ETCD_LISTEN_CLIENT_URLS="http://203.0.113.12:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://203.0.113.12:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://203.0.113.12:2379"
ETCD_INITIAL_CLUSTER="etcd1=http://203.0.113.10:2380,etcd2=http://203.0.113.11:2380,etcd3=http://203.0.113.12:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="timescale-cluster"
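The three configurations differ only in the node name and IP, so you can template the environment file once and render it on each host. A sketch, assuming NODE_NAME and NODE_IP are set per node (the defaults match this guide's example IPs):

```shell
#!/usr/bin/env bash
# Sketch: render the etcd environment file for one node.
# NODE_NAME / NODE_IP are placeholders you override per host.
set -eu
NODE_NAME="${NODE_NAME:-etcd1}"
NODE_IP="${NODE_IP:-203.0.113.10}"
CLUSTER="etcd1=http://203.0.113.10:2380,etcd2=http://203.0.113.11:2380,etcd3=http://203.0.113.12:2380"

cat > etcd.env <<EOF
ETCD_NAME="${NODE_NAME}"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="http://${NODE_IP}:2380"
ETCD_LISTEN_CLIENT_URLS="http://${NODE_IP}:2379,http://127.0.0.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${NODE_IP}:2380"
ETCD_ADVERTISE_CLIENT_URLS="http://${NODE_IP}:2379"
ETCD_INITIAL_CLUSTER="${CLUSTER}"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="timescale-cluster"
EOF
echo "wrote etcd.env for ${NODE_NAME} (${NODE_IP})"
```

Copy the rendered etcd.env over the distribution's etcd environment file on each node after setting NODE_NAME and NODE_IP for that host.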

Start etcd cluster

Start etcd on all three nodes simultaneously to form the cluster.

sudo systemctl enable --now etcd
sudo systemctl status etcd

Install Patroni

Install Patroni on all three nodes to manage PostgreSQL with automatic failover.

# Ubuntu / Debian
sudo apt install -y python3-pip python3-psycopg2
# Note: on Ubuntu 24.04 / Debian 12, system-wide pip installs may need --break-system-packages (PEP 668)
sudo pip3 install 'patroni[etcd3]'

# AlmaLinux / Rocky Linux
sudo dnf install -y python3-pip python3-psycopg2
sudo pip3 install 'patroni[etcd3]'

Create PostgreSQL user and directories

The PostgreSQL packages normally create the postgres system user already; create it only if it is missing, then prepare the data directory on all nodes.

id postgres >/dev/null 2>&1 || sudo useradd -m postgres
sudo mkdir -p /data/postgresql
sudo chown postgres:postgres /data/postgresql
sudo chmod 700 /data/postgresql
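A quick sanity check that ownership and mode came out right, sketched as a small helper (the path matches the commands above):

```shell
# Sketch: report owner, group, and mode of a directory, or flag it as missing.
check_dir() {
  local dir="$1"
  if [ -d "$dir" ]; then
    # GNU stat: owner:group, octal mode, path
    stat -c '%U:%G %a %n' "$dir"
  else
    echo "missing: $dir"
  fi
}

check_dir /data/postgresql
```

Expect `postgres:postgres 700 /data/postgresql` on a correctly prepared node.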

Configure Patroni on first node

Create the Patroni configuration for the first node. This will become the initial primary.

scope: timescale-cluster
namespace: /db/
name: node1

restapi:
  listen: 203.0.113.10:8008
  connect_address: 203.0.113.10:8008

# etcd 3.4+ ships with the v2 API disabled by default, so use Patroni's etcd3 support
etcd3:
  hosts: 203.0.113.10:2379,203.0.113.11:2379,203.0.113.12:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 30
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_connections: 100
        max_worker_processes: 8
        wal_keep_size: 128MB  # wal_keep_segments was removed in PostgreSQL 13
        max_wal_senders: 10
        max_replication_slots: 10
        max_prepared_transactions: 0
        max_locks_per_transaction: 64
        wal_log_hints: "on"
        track_commit_timestamp: "off"
        archive_mode: "on"
        archive_timeout: 1800s
        archive_command: mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f
        shared_preload_libraries: 'timescaledb'
      recovery_conf:
        restore_command: cp ../wal_archive/%f %p

  initdb:
  - encoding: UTF8
  - data-checksums

  pg_hba:
  - host replication replicator 127.0.0.1/32 md5
  - host replication replicator 203.0.113.10/32 md5
  - host replication replicator 203.0.113.11/32 md5
  - host replication replicator 203.0.113.12/32 md5
  - host all all 0.0.0.0/0 md5

  users:
    admin:
      password: StrongAdminPassword123!
      options:
        - createrole
        - createdb

postgresql:
  listen: 203.0.113.10:5432
  connect_address: 203.0.113.10:5432
  data_dir: /data/postgresql
  bin_dir: /usr/lib/postgresql/15/bin  # Debian/Ubuntu layout; use /usr/pgsql-15/bin on RHEL-family systems
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: ReplicatorPassword123!
    superuser:
      username: postgres
      password: PostgresPassword123!
    rewind:
      username: rewind_user
      password: RewindPassword123!
  parameters:
    unix_socket_directories: '.'

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

Configure Patroni on second node

Create the Patroni configuration for the second node with its specific IP address.

scope: timescale-cluster
namespace: /db/
name: node2

restapi:
  listen: 203.0.113.11:8008
  connect_address: 203.0.113.11:8008

# etcd 3.4+ ships with the v2 API disabled by default, so use Patroni's etcd3 support
etcd3:
  hosts: 203.0.113.10:2379,203.0.113.11:2379,203.0.113.12:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 30
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_connections: 100
        max_worker_processes: 8
        wal_keep_size: 128MB  # wal_keep_segments was removed in PostgreSQL 13
        max_wal_senders: 10
        max_replication_slots: 10
        max_prepared_transactions: 0
        max_locks_per_transaction: 64
        wal_log_hints: "on"
        track_commit_timestamp: "off"
        archive_mode: "on"
        archive_timeout: 1800s
        archive_command: mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f
        shared_preload_libraries: 'timescaledb'
      recovery_conf:
        restore_command: cp ../wal_archive/%f %p

postgresql:
  listen: 203.0.113.11:5432
  connect_address: 203.0.113.11:5432
  data_dir: /data/postgresql
  bin_dir: /usr/lib/postgresql/15/bin  # Debian/Ubuntu layout; use /usr/pgsql-15/bin on RHEL-family systems
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: ReplicatorPassword123!
    superuser:
      username: postgres
      password: PostgresPassword123!
    rewind:
      username: rewind_user
      password: RewindPassword123!
  parameters:
    unix_socket_directories: '.'

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

Configure Patroni on third node

Create the Patroni configuration for the third node to complete the cluster setup.

scope: timescale-cluster
namespace: /db/
name: node3

restapi:
  listen: 203.0.113.12:8008
  connect_address: 203.0.113.12:8008

# etcd 3.4+ ships with the v2 API disabled by default, so use Patroni's etcd3 support
etcd3:
  hosts: 203.0.113.10:2379,203.0.113.11:2379,203.0.113.12:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 30
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        wal_level: replica
        hot_standby: "on"
        max_connections: 100
        max_worker_processes: 8
        wal_keep_size: 128MB  # wal_keep_segments was removed in PostgreSQL 13
        max_wal_senders: 10
        max_replication_slots: 10
        max_prepared_transactions: 0
        max_locks_per_transaction: 64
        wal_log_hints: "on"
        track_commit_timestamp: "off"
        archive_mode: "on"
        archive_timeout: 1800s
        archive_command: mkdir -p ../wal_archive && test ! -f ../wal_archive/%f && cp %p ../wal_archive/%f
        shared_preload_libraries: 'timescaledb'
      recovery_conf:
        restore_command: cp ../wal_archive/%f %p

postgresql:
  listen: 203.0.113.12:5432
  connect_address: 203.0.113.12:5432
  data_dir: /data/postgresql
  bin_dir: /usr/lib/postgresql/15/bin  # Debian/Ubuntu layout; use /usr/pgsql-15/bin on RHEL-family systems
  pgpass: /tmp/pgpass
  authentication:
    replication:
      username: replicator
      password: ReplicatorPassword123!
    superuser:
      username: postgres
      password: PostgresPassword123!
    rewind:
      username: rewind_user
      password: RewindPassword123!
  parameters:
    unix_socket_directories: '.'

tags:
    nofailover: false
    noloadbalance: false
    clonefrom: false
    nosync: false

Create Patroni systemd service

Create a systemd service file for Patroni on all three nodes to manage the service lifecycle.

[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
ExecStart=/usr/local/bin/patroni /etc/patroni/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Set correct permissions

Save each node's configuration as /etc/patroni/patroni.yml, then set ownership and permissions. The file contains database passwords, so only the postgres user should be able to read it.

Never use chmod 777: it gives every user on the system full access to your files. Fix ownership with chown and grant minimal permissions instead.

sudo mkdir -p /etc/patroni
sudo chown postgres:postgres /etc/patroni/patroni.yml
sudo chmod 600 /etc/patroni/patroni.yml
sudo systemctl daemon-reload

Start Patroni cluster

Start Patroni on the first node, wait for it to initialize, then start the other nodes.

sudo systemctl enable --now patroni
sudo systemctl status patroni
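Once the services are up, Patroni's REST API (port 8008, as configured above) gives a quick per-node view. A sketch that polls each node and reports unreachable ones instead of aborting (`/health` returns 200 on any running member; `/primary` returns 200 only on the leader):

```shell
# Sketch: check each node's Patroni REST endpoint.
check_patroni() {
  local ip="$1"
  if curl -fsS --max-time 3 "http://${ip}:8008/health" >/dev/null 2>&1; then
    echo "${ip}: healthy"
  else
    echo "${ip}: unreachable"
  fi
}

for ip in 203.0.113.10 203.0.113.11 203.0.113.12; do
  check_patroni "$ip"
done
```

All three nodes should report healthy before you move on to enabling the extension.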

Enable TimescaleDB extension

Connect to the primary node and enable the TimescaleDB extension in your database.

# connect to the primary (password: the superuser password from patroni.yml)
sudo -u postgres psql -h 203.0.113.10 -p 5432

-- then, inside psql:
CREATE DATABASE timeseries;
\c timeseries
CREATE EXTENSION IF NOT EXISTS timescaledb;
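With the extension in place, a quick smoke test is to create a hypertable. The table and column names below are illustrative, not part of the tutorial; the SQL is written to a file so you can replay it against the primary:

```shell
# Sketch: write a hypertable smoke test to hypertable_demo.sql.
# Table/column names are illustrative assumptions.
cat > hypertable_demo.sql <<'SQL'
CREATE TABLE IF NOT EXISTS conditions (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT,
  temperature DOUBLE PRECISION
);
SELECT create_hypertable('conditions', 'time', if_not_exists => TRUE);
SQL

# Run it against the primary (adjust host and credentials):
#   sudo -u postgres psql -h 203.0.113.10 -d timeseries -f hypertable_demo.sql
echo "wrote hypertable_demo.sql"
```

If `create_hypertable` succeeds, the extension is loaded and replication will carry the chunks to the replicas automatically.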

Configure firewall rules

Open the required ports for PostgreSQL, etcd, and Patroni communication.

# Ubuntu / Debian (ufw)
sudo ufw allow 5432/tcp
sudo ufw allow 8008/tcp
sudo ufw allow 2379/tcp
sudo ufw allow 2380/tcp

# AlmaLinux / Rocky Linux (firewalld)
sudo firewall-cmd --permanent --add-port=5432/tcp
sudo firewall-cmd --permanent --add-port=8008/tcp
sudo firewall-cmd --permanent --add-port=2379/tcp
sudo firewall-cmd --permanent --add-port=2380/tcp
sudo firewall-cmd --reload

Verify your setup

Check that your TimescaleDB cluster is running correctly and replication is working.

patronictl -c /etc/patroni/patroni.yml list
ETCDCTL_API=3 etcdctl --endpoints=http://203.0.113.10:2379 endpoint health --cluster   # "cluster-health" only exists in the legacy v2 etcdctl
sudo -u postgres psql -h 203.0.113.10 -c "SELECT * FROM pg_stat_replication;"
sudo -u postgres psql -h 203.0.113.10 -c "SELECT version();"
sudo -u postgres psql -h 203.0.113.10 -d timeseries -c "SELECT default_version, installed_version FROM pg_available_extensions WHERE name = 'timescaledb';"

Test high availability scenarios

Test automatic failover

Simulate a primary failure by stopping Patroni on the current leader (node1 here), then watch a replica get promoted.

# on the current primary
sudo systemctl stop patroni

# from another node: within roughly the 30s TTL a new leader appears
patronictl -c /etc/patroni/patroni.yml list

# restart the old primary; it rejoins as a replica
sudo systemctl start patroni

# for a planned role change, use switchover instead of simulating a failure
patronictl -c /etc/patroni/patroni.yml switchover --candidate node2
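While the failover runs, polling the member list a few times shows the leader change as it happens. A small sketch (paths as configured above; falls back to a message where patronictl is absent):

```shell
# Sketch: poll cluster membership a few times during a failover test.
poll_members() {
  local tries="${1:-3}" i
  for i in $(seq "$tries"); do
    if command -v patronictl >/dev/null 2>&1; then
      patronictl -c /etc/patroni/patroni.yml list
    else
      echo "poll ${i}: patronictl not found"
    fi
    sleep 1
  done
}

poll_members 3
```

You should see the old leader drop out, a replica take the Leader role, and the old node return as a replica.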

Monitor cluster status

Use Patroni's built-in monitoring to check cluster health and replication lag.

patronictl -c /etc/patroni/patroni.yml list
patronictl -c /etc/patroni/patroni.yml show-config
sudo -u postgres psql -h 203.0.113.11 -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"   # run against the current leader; pg_stat_replication is only populated there

Common issues

Symptom                       | Cause                           | Fix
Patroni won't start           | etcd cluster not running        | sudo systemctl restart etcd on all nodes
Replication lag high          | Network latency or disk I/O     | Check pg_stat_replication and optimize disk performance
Split-brain condition         | Network partition               | Check etcd quorum with etcdctl endpoint health
TimescaleDB extension missing | Not in shared_preload_libraries | Restart PostgreSQL after adding it to the config
Connection refused            | Firewall blocking ports         | Open ports 5432, 8008, 2379, 2380


Need help?

Don't want to manage this yourself?

We handle high-availability infrastructure for businesses that depend on uptime, from initial setup to ongoing operations.