MariaDB Galera Cluster Multi-Master Setup

Set up a highly available MariaDB Galera cluster with multi-master replication and automatic failover for production database workloads. This tutorial covers cluster initialization, node configuration, and monitoring setup across multiple servers.

Prerequisites

At least 3 servers with 4GB RAM each
Network connectivity between cluster nodes
Root or sudo access on all nodes

What this solves

MariaDB Galera cluster provides synchronous multi-master replication for database high availability. This eliminates single points of failure by allowing read and write operations on any cluster node with automatic failover when nodes become unavailable.

Step-by-step installation

Prepare cluster nodes

You'll need at least 3 nodes for a production cluster. Update all systems and configure hostnames for cluster communication.

sudo apt update && sudo apt upgrade -y
sudo hostnamectl set-hostname galera-node1

sudo dnf update -y
sudo hostnamectl set-hostname galera-node1

Configure /etc/hosts on all nodes with cluster member IP addresses:

203.0.113.10  galera-node1
203.0.113.11  galera-node2
203.0.113.12  galera-node3

Install MariaDB Galera packages

Install MariaDB server and Galera cluster components on all nodes. The galera package provides the wsrep provider for synchronous replication.

sudo apt install -y mariadb-server galera-4 mariadb-client mariadb-backup rsync

sudo dnf install -y mariadb-server galera mariadb rsync

Stop MariaDB services

Stop MariaDB on all nodes before configuring the cluster. This prevents individual node startup during cluster configuration.

sudo systemctl stop mariadb
sudo systemctl disable mariadb

Configure Galera cluster settings

Create the Galera configuration file on all nodes. This defines cluster membership, replication settings, and wsrep provider options.

[galera]
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = "galera_cluster"
wsrep_cluster_address = "gcomm://203.0.113.10,203.0.113.11,203.0.113.12"
wsrep_sst_method = rsync
wsrep_node_address = "203.0.113.10"
wsrep_node_name = "galera-node1"

binlog_format = row
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1
query_cache_size = 0
query_cache_type = 0

[mysqld]
bind-address = 0.0.0.0

Important: Change wsrep_node_address and wsrep_node_name for each node to match their respective IP addresses and hostnames.

Configure cluster authentication

Create a cluster user for state snapshot transfers (SST). This user enables nodes to synchronize data during startup and recovery.

[galera]
wsrep_sst_auth = sst_user:secure_sst_password

Initialize the bootstrap node

Start the first node in bootstrap mode to create the initial cluster. This establishes the primary component for other nodes to join.

sudo galera_new_cluster

Verify the bootstrap node started successfully:

sudo systemctl status mariadb

Secure MariaDB installation

Run the security script on the bootstrap node to set root password and remove test databases.

sudo mysql_secure_installation

Create cluster SST user

Connect to MariaDB and create the state snapshot transfer user referenced in your Galera configuration.

sudo mysql -u root -p

CREATE USER 'sst_user'@'localhost' IDENTIFIED BY 'secure_sst_password';
GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst_user'@'localhost';
FLUSH PRIVILEGES;
EXIT;

Join additional nodes

Start MariaDB on the remaining nodes. They will automatically connect to the cluster and synchronize data from the bootstrap node.

sudo systemctl start mariadb
sudo systemctl enable mariadb

Enable automatic startup

Enable MariaDB service on the bootstrap node after successful cluster formation.

sudo systemctl enable mariadb

Configure cluster monitoring

Install cluster status monitoring

Set up cluster health monitoring to track node status and detect split-brain scenarios. This is crucial for production cluster management.

sudo mysql -u root -p -e "SHOW STATUS LIKE 'wsrep%'"

Create monitoring user

Create a dedicated user for monitoring tools and health checks with minimal required privileges.

CREATE USER 'monitoring'@'%' IDENTIFIED BY 'monitor_password';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'monitoring'@'%';
FLUSH PRIVILEGES;

Configure cluster recovery

Set up automatic cluster recovery by configuring the safe_to_bootstrap flag. This prevents split-brain conditions during cluster restarts.

[Unit]
After=network.target

[Service]
ExecStartPre=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"
ExecStartPre=/bin/sh -c "[ ! -e /var/lib/mysql/grastate.dat ] || systemctl set-environment _WSREP_START_POSITION=--wsrep_start_position=$(sudo -u mysql grep -E '^seqno:' /var/lib/mysql/grastate.dat | cut -d: -f2)"
ExecStart=
ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION
KillMode=process

sudo systemctl daemon-reload

Verify your setup

Check cluster status and node synchronization across all members:

# Check cluster size and status
sudo mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
sudo mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_local_state_comment';"

# Test write operations on different nodes
sudo mysql -u root -p -e "CREATE DATABASE test_cluster; USE test_cluster; CREATE TABLE test (id INT PRIMARY KEY, data VARCHAR(100)); INSERT INTO test VALUES (1, 'Node 1 write');"

# Verify replication on other nodes
sudo mysql -u root -p -e "USE test_cluster; SELECT * FROM test;"

Implement automatic failover

Configure application connection pooling

Set up connection pooling with automatic failover using ProxySQL or HAProxy. This ensures applications seamlessly handle node failures. For detailed ProxySQL configuration, see our MySQL connection pooling guide.

global
    maxconn 4096
    log stdout local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    mode tcp
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option tcplog

listen galera_cluster
    bind *:3306
    mode tcp
    option tcpka
    option mysql-check user monitoring password monitor_password
    balance roundrobin
    server galera-node1 203.0.113.10:3306 check weight 1
    server galera-node2 203.0.113.11:3306 check weight 1
    server galera-node3 203.0.113.12:3306 check weight 1

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s

Test failover scenarios

Simulate node failures to verify automatic failover functionality and application resilience.

# Stop one node to test failover
sudo systemctl stop mariadb

# Check cluster status on remaining nodes
sudo mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"

# Restart failed node
sudo systemctl start mariadb

# Verify node rejoined cluster
sudo mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_local_state_comment';"

Common issues

Symptom	Cause	Fix
Node fails to join cluster	Incorrect cluster address or firewall blocking ports 4567-4568	`sudo ufw allow 4567:4568/tcp` and verify `wsrep_cluster_address`
Split-brain condition	Network partition or simultaneous node restart	Check `grastate.dat` safe_to_bootstrap flag, bootstrap from most recent node
SST failure during node startup	Insufficient disk space or incorrect SST user permissions	Verify disk space and recreate SST user with correct privileges
High replication lag	Network latency or heavy write workload	Monitor `wsrep_local_recv_queue` and consider increasing `wsrep_slave_threads`
Cluster startup fails after shutdown	All nodes shutdown gracefully, need bootstrap node	Find node with highest seqno in `grastate.dat` and start with `galera_new_cluster`

Next steps

Running this in production?

Want this handled for you? Running this at scale adds a second layer of work: capacity planning, failover drills, cost control, and on-call. See how we run infrastructure like this for European teams.

Automated install script

Run this to automate the entire setup

install.sh

#!/usr/bin/env bash
set -euo pipefail

# MariaDB Galera Cluster Installation Script
# Usage: ./install_galera.sh <node_number> <node1_ip> <node2_ip> <node3_ip>

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Print colored output
print_status() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

print_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $1"
}

print_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

# Usage message
usage() {
    echo "Usage: $0 <node_number> <node1_ip> <node2_ip> <node3_ip>"
    echo "Example: $0 1 192.168.1.10 192.168.1.11 192.168.1.12"
    echo "node_number: 1, 2, or 3 (which node this is)"
    exit 1
}

# Cleanup function
cleanup() {
    print_error "Installation failed. Cleaning up..."
    systemctl stop mariadb 2>/dev/null || true
    systemctl disable mariadb 2>/dev/null || true
    if [ "$PKG_MGR" = "apt" ]; then
        apt remove --purge -y mariadb-server galera-4 2>/dev/null || true
    else
        $PKG_INSTALL remove -y mariadb-server galera 2>/dev/null || true
    fi
    rm -f /etc/mysql/mariadb.conf.d/60-galera.cnf /etc/my.cnf.d/galera.cnf 2>/dev/null || true
}

trap cleanup ERR

# Check arguments
if [ $# -ne 4 ]; then
    usage
fi

NODE_NUM=$1
NODE1_IP=$2
NODE2_IP=$3
NODE3_IP=$4

# Validate node number
if [[ ! "$NODE_NUM" =~ ^[1-3]$ ]]; then
    print_error "Node number must be 1, 2, or 3"
    exit 1
fi

# Validate IP addresses
for ip in "$NODE1_IP" "$NODE2_IP" "$NODE3_IP"; do
    if ! [[ "$ip" =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
        print_error "Invalid IP address: $ip"
        exit 1
    fi
done

# Set current node IP based on node number
case $NODE_NUM in
    1) CURRENT_IP=$NODE1_IP; HOSTNAME="galera-node1" ;;
    2) CURRENT_IP=$NODE2_IP; HOSTNAME="galera-node2" ;;
    3) CURRENT_IP=$NODE3_IP; HOSTNAME="galera-node3" ;;
esac

# Check if running as root
if [ "$EUID" -ne 0 ]; then
    print_error "This script must be run as root"
    exit 1
fi

echo "[1/10] Detecting distribution and package manager..."
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian) 
            PKG_MGR="apt"
            PKG_INSTALL="apt install -y"
            PKG_UPDATE="apt update && apt upgrade -y"
            GALERA_CONF="/etc/mysql/mariadb.conf.d/60-galera.cnf"
            GALERA_PKG="galera-4"
            ;;
        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_INSTALL="dnf install -y"
            PKG_UPDATE="dnf update -y"
            GALERA_CONF="/etc/my.cnf.d/galera.cnf"
            GALERA_PKG="galera"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_INSTALL="yum install -y"
            PKG_UPDATE="yum update -y"
            GALERA_CONF="/etc/my.cnf.d/galera.cnf"
            GALERA_PKG="galera"
            ;;
        *)
            print_error "Unsupported distribution: $ID"
            exit 1
            ;;
    esac
else
    print_error "Cannot detect distribution"
    exit 1
fi

print_status "Detected: $PRETTY_NAME using $PKG_MGR"

echo "[2/10] Updating system packages..."
$PKG_UPDATE

echo "[3/10] Setting hostname..."
hostnamectl set-hostname "$HOSTNAME"

echo "[4/10] Configuring /etc/hosts..."
# Remove existing galera entries
sed -i '/galera-node/d' /etc/hosts
# Add cluster nodes
cat >> /etc/hosts << EOF
$NODE1_IP  galera-node1
$NODE2_IP  galera-node2
$NODE3_IP  galera-node3
EOF

echo "[5/10] Installing MariaDB and Galera packages..."
if [ "$PKG_MGR" = "apt" ]; then
    $PKG_INSTALL mariadb-server $GALERA_PKG mariadb-client mariadb-backup rsync
else
    $PKG_INSTALL mariadb-server $GALERA_PKG mariadb rsync
fi

echo "[6/10] Configuring firewall..."
if command -v firewall-cmd &> /dev/null; then
    # RHEL-based firewall
    systemctl enable --now firewalld
    firewall-cmd --permanent --add-port=3306/tcp  # MariaDB
    firewall-cmd --permanent --add-port=4567/tcp  # Galera cluster replication
    firewall-cmd --permanent --add-port=4568/tcp  # IST
    firewall-cmd --permanent --add-port=4444/tcp  # SST
    firewall-cmd --reload
elif command -v ufw &> /dev/null; then
    # Ubuntu/Debian firewall
    ufw allow 3306/tcp
    ufw allow 4567/tcp
    ufw allow 4568/tcp
    ufw allow 4444/tcp
fi

echo "[7/10] Stopping MariaDB service..."
systemctl stop mariadb 2>/dev/null || true
systemctl disable mariadb

echo "[8/10] Creating Galera configuration..."
# Create directory if it doesn't exist
mkdir -p "$(dirname "$GALERA_CONF")"

# Determine wsrep provider path
if [ -f /usr/lib64/galera-4/libgalera_smm.so ]; then
    WSREP_PROVIDER="/usr/lib64/galera-4/libgalera_smm.so"
elif [ -f /usr/lib/galera/libgalera_smm.so ]; then
    WSREP_PROVIDER="/usr/lib/galera/libgalera_smm.so"
elif [ -f /usr/lib64/galera/libgalera_smm.so ]; then
    WSREP_PROVIDER="/usr/lib64/galera/libgalera_smm.so"
else
    print_error "Galera wsrep provider not found"
    exit 1
fi

cat > "$GALERA_CONF" << EOF
[galera]
wsrep_on = ON
wsrep_provider = $WSREP_PROVIDER
wsrep_cluster_name = "galera_cluster"
wsrep_cluster_address = "gcomm://$NODE1_IP,$NODE2_IP,$NODE3_IP"
wsrep_sst_method = rsync
wsrep_node_address = "$CURRENT_IP"
wsrep_node_name = "$HOSTNAME"
wsrep_sst_auth = sst_user:$(openssl rand -base64 12)

binlog_format = row
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1
query_cache_size = 0
query_cache_type = 0

[mysqld]
bind-address = 0.0.0.0
EOF

chmod 644 "$GALERA_CONF"

echo "[9/10] Node-specific setup..."
if [ "$NODE_NUM" = "1" ]; then
    print_status "Initializing bootstrap node..."
    galera_new_cluster
    
    # Wait for MariaDB to start
    sleep 5
    
    print_status "Securing MariaDB installation..."
    # Generate random root password
    ROOT_PASS=$(openssl rand -base64 16)
    mysql -u root << EOF
UPDATE mysql.user SET Password=PASSWORD('$ROOT_PASS') WHERE User='root';
DELETE FROM mysql.user WHERE User='';
DELETE FROM mysql.user WHERE User='root' AND Host NOT IN ('localhost', '127.0.0.1', '::1');
DROP DATABASE IF EXISTS test;
DELETE FROM mysql.db WHERE Db='test' OR Db='test\\_%';
CREATE USER 'sst_user'@'localhost' IDENTIFIED BY '$(grep wsrep_sst_auth $GALERA_CONF | cut -d: -f2)';
GRANT PROCESS, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst_user'@'localhost';
CREATE USER 'monitoring'@'%' IDENTIFIED BY '$(openssl rand -base64 12)';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'monitoring'@'%';
FLUSH PRIVILEGES;
EOF

    echo "MariaDB root password: $ROOT_PASS" > /root/.mariadb_root_password
    chmod 600 /root/.mariadb_root_password
    
    systemctl enable mariadb
    print_status "Bootstrap node configured. Root password saved to /root/.mariadb_root_password"
else
    print_warning "This is node $NODE_NUM. Start MariaDB after the bootstrap node (node 1) is running:"
    print_warning "systemctl start mariadb && systemctl enable mariadb"
fi

echo "[10/10] Verifying installation..."
if systemctl is-active --quiet mariadb; then
    print_status "MariaDB is running"
    mysql -u root -p$(cat /root/.mariadb_root_password 2>/dev/null || echo '') -e "SHOW STATUS LIKE 'wsrep_cluster_size';" 2>/dev/null || true
else
    print_status "MariaDB is configured but not running (normal for non-bootstrap nodes)"
fi

print_status "Galera cluster node $NODE_NUM installation complete!"
print_status "Configuration file: $GALERA_CONF"

Review the script before running. Execute with: bash install.sh

#mariadb #galera #clustering #replication #high-availability

Configure MariaDB Galera cluster for multi-master replication with automatic failover