Deploy CockroachDB across multiple regions with automated failover, data locality controls, and production-grade security. Includes replication zones, load balancing, and disaster recovery strategies.
Prerequisites
- 9 servers across 3 regions
- Root access
- Network connectivity between regions
- Minimum 4GB RAM per node
What this solves
CockroachDB's multi-region clustering provides distributed SQL with automatic failover, consistent backups across regions, and data locality controls for regulatory compliance. You need this when your application serves users globally and requires sub-second query response times with zero-downtime failover.
Multi-region cluster architecture and planning
Plan your cluster topology
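CockroachDB replicates each range with Raft, so it is the replication factor (3 by default) that should be odd, not the node count. For a production multi-region deployment, use at least 3 regions with 3 nodes each so that a majority of replicas survives the loss of any single node, zone, or region.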
# Region 1 (Primary): us-east-1
Nodes: cockroach-1, cockroach-2, cockroach-3
# Region 2 (Secondary): eu-west-1
Nodes: cockroach-4, cockroach-5, cockroach-6
# Region 3 (Tertiary): ap-southeast-1
Nodes: cockroach-7, cockroach-8, cockroach-9
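The fault-tolerance arithmetic behind this layout can be sketched in shell. Raft keeps a range available while a majority of its replicas is reachable, so a replication factor of n tolerates floor((n-1)/2) replica failures. The function names are illustrative and not part of CockroachDB.

```shell
# quorum_size N: replicas that must stay reachable for a range with N replicas.
quorum_size() { echo $(( $1 / 2 + 1 )); }

# tolerated_failures N: replicas that can be lost while a majority remains.
tolerated_failures() { echo $(( ($1 - 1) / 2 )); }

for rf in 3 4 5; do
  echo "replication factor $rf: quorum $(quorum_size "$rf"), tolerates $(tolerated_failures "$rf") failure(s)"
done
```

A factor of 4 tolerates no more failures than a factor of 3, which is why odd replication factors are preferred.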
Configure network requirements
Each node needs bidirectional connectivity on ports 26257 (SQL) and 8080 (Admin UI). Plan for 100ms RTT maximum between regions.
# Required ports per node:
26257: SQL and inter-node communication
8080: Admin UI (optional, can be disabled)
Bandwidth: 10Mbps minimum per node
Latency: <100ms between regions
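Before installing anything, it can help to confirm that the two ports are reachable between nodes. The following is a minimal sketch that assumes bash and the coreutils timeout command are available; port_open and check_node are hypothetical helper names.

```shell
# port_open HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Relies on bash's built-in /dev/tcp redirection, so no extra tools are needed.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# check_node HOST: report the state of both CockroachDB ports on one node.
check_node() {
  for port in 26257 8080; do
    if port_open "$1" "$port"; then
      echo "$1:$port open"
    else
      echo "$1:$port unreachable"
    fi
  done
}
```

Run, for example, check_node 10.0.2.10 from a node in another region. A port only reports open once something is listening on it, so run the check again after CockroachDB has started.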
Install and configure CockroachDB nodes across regions
Install CockroachDB on all nodes
Download and install the CockroachDB binary (v24.3.0 in this guide) on each server in every region.
wget -qO- https://binaries.cockroachdb.com/cockroach-v24.3.0.linux-amd64.tgz | tar xz
sudo cp -i cockroach-v24.3.0.linux-amd64/cockroach /usr/local/bin/
sudo mkdir -p /usr/local/lib/cockroach
sudo cp -i cockroach-v24.3.0.linux-amd64/lib/libgeos.so /usr/local/lib/cockroach/
sudo cp -i cockroach-v24.3.0.linux-amd64/lib/libgeos_c.so /usr/local/lib/cockroach/
Create CockroachDB user and directories
Set up the cockroach user and create necessary directories with proper permissions.
sudo useradd -r -s /bin/bash -d /var/lib/cockroach cockroach
sudo mkdir -p /var/lib/cockroach /var/log/cockroach /etc/cockroach
sudo chown cockroach:cockroach /var/lib/cockroach /var/log/cockroach /etc/cockroach
sudo chmod 755 /var/lib/cockroach /var/log/cockroach
sudo chmod 750 /etc/cockroach
Generate cluster certificates
Create a certificate authority and node certificates for secure inter-node communication. Run this on your first node, then distribute certificates.
sudo -u cockroach mkdir -p /var/lib/cockroach/certs /var/lib/cockroach/my-safe-directory
cd /var/lib/cockroach
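# Create CA certificate
sudo -u cockroach cockroach cert create-ca --certs-dir=certs --ca-key=my-safe-directory/ca.key
# Create one node certificate that covers every node in all three regions (the same certificate is copied to all nodes in the next step)
sudo -u cockroach cockroach cert create-node localhost 127.0.0.1 10.0.1.10 10.0.1.11 10.0.1.12 10.0.2.10 10.0.2.11 10.0.2.12 10.0.3.10 10.0.3.11 10.0.3.12 cockroach-1 cockroach-2 cockroach-3 cockroach-4 cockroach-5 cockroach-6 cockroach-7 cockroach-8 cockroach-9 --certs-dir=certs --ca-key=my-safe-directory/ca.key
# Create client certificate for the root user
sudo -u cockroach cockroach cert create-client root --certs-dir=certs --ca-key=my-safe-directory/ca.key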
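A node certificate is only accepted for the hostnames and IP addresses listed when it was created, so a certificate shared by all nodes has to name every one of them. The sketch below builds that list, assuming the cockroach-1 to cockroach-9 hostnames and the 10.0.<region>.10-12 addressing used in this guide's examples. node_cert_names is a hypothetical helper.

```shell
# node_cert_names: print every hostname and IP the shared node certificate must cover.
node_cert_names() {
  local names="localhost 127.0.0.1" n=1 region i
  for region in 1 2 3; do
    for i in 10 11 12; do
      names="$names 10.0.${region}.${i} cockroach-${n}"
      n=$((n + 1))
    done
  done
  echo "$names"
}

node_cert_names
# Intended use (left unquoted on purpose so each name becomes its own argument):
#   cockroach cert create-node $(node_cert_names) --certs-dir=certs --ca-key=my-safe-directory/ca.key
```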
Distribute certificates to all nodes
Copy the certificates to every other node in the cluster. Replace the hostnames with your own node names or IP addresses. The CA key in my-safe-directory stays on the first node and is not copied.
# From the first node, copy certificates to all other nodes
for node in cockroach-2 cockroach-3 cockroach-4 cockroach-5 cockroach-6 cockroach-7 cockroach-8 cockroach-9; do
scp -r /var/lib/cockroach/certs root@${node}:/tmp/
ssh root@${node} "sudo mv /tmp/certs /var/lib/cockroach/ && sudo chown -R cockroach:cockroach /var/lib/cockroach/certs"
done
Set up inter-region networking and security
Configure firewall rules
Open required ports for CockroachDB communication between all nodes in your cluster.
sudo ufw allow 26257/tcp comment 'CockroachDB SQL'
sudo ufw allow 8080/tcp comment 'CockroachDB Admin UI'
sudo ufw enable
Create systemd service files
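Create /etc/systemd/system/cockroachdb.service on each node with that node's address and locality. Replace %NODE_IP%, %REGION%, and %ZONE% with real values before starting the service: systemd treats % as a specifier prefix, so the placeholders cannot be left in the file.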
[Unit]
Description=CockroachDB database server
Requires=network.target
[Service]
Type=notify
User=cockroach
ExecStart=/usr/local/bin/cockroach start \
--certs-dir=/var/lib/cockroach/certs \
--advertise-addr=%NODE_IP% \
--join=10.0.1.10:26257,10.0.2.10:26257,10.0.3.10:26257 \
--locality=region=%REGION%,zone=%ZONE% \
--cache=.25 \
--max-sql-memory=.25 \
--store=/var/lib/cockroach/data \
--log-dir=/var/log/cockroach
TimeoutStopSec=60
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cockroach
KillSignal=SIGTERM
[Install]
WantedBy=multi-user.target
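Filling in the three placeholders by hand on nine nodes is error-prone. A small sed wrapper can do it instead, as a sketch. render_unit and the template file name in the usage note are hypothetical.

```shell
# render_unit NODE_IP REGION ZONE: read a unit template on stdin and
# replace the %NODE_IP%, %REGION% and %ZONE% placeholders.
render_unit() {
  sed -e "s/%NODE_IP%/$1/g" -e "s/%REGION%/$2/g" -e "s/%ZONE%/$3/g"
}

# Demonstration on two lines taken from the unit file above.
printf '%s\n' '--advertise-addr=%NODE_IP%' '--locality=region=%REGION%,zone=%ZONE%' |
  render_unit 10.0.2.11 eu-west-1 eu-west-1b
```

On a real node the output would be written to the unit file, for example: render_unit 10.0.2.11 eu-west-1 eu-west-1b < cockroachdb.service.tmpl | sudo tee /etc/systemd/system/cockroachdb.service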
Configure locality for each node
Edit the service file on each node to set appropriate region and zone values for data locality.
# Region 1 nodes (us-east-1)
Node 1: --locality=region=us-east-1,zone=us-east-1a
Node 2: --locality=region=us-east-1,zone=us-east-1b
Node 3: --locality=region=us-east-1,zone=us-east-1c
# Region 2 nodes (eu-west-1)
Node 4: --locality=region=eu-west-1,zone=eu-west-1a
Node 5: --locality=region=eu-west-1,zone=eu-west-1b
Node 6: --locality=region=eu-west-1,zone=eu-west-1c
# Region 3 nodes (ap-southeast-1)
Node 7: --locality=region=ap-southeast-1,zone=ap-southeast-1a
Node 8: --locality=region=ap-southeast-1,zone=ap-southeast-1b
Node 9: --locality=region=ap-southeast-1,zone=ap-southeast-1c
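The mapping in the list above is regular enough to compute from the node number. This sketch assumes the numbering used in this guide (nodes 1-3 in us-east-1, 4-6 in eu-west-1, 7-9 in ap-southeast-1, zones a to c in order). locality_for_node is a hypothetical helper.

```shell
# locality_for_node N: print the --locality flag for node N (1-9).
locality_for_node() {
  local region zone
  case "$1" in
    [1-9]) ;;
    *) echo "node number must be 1-9" >&2; return 1 ;;
  esac
  case $(( ($1 - 1) / 3 )) in
    0) region="us-east-1" ;;
    1) region="eu-west-1" ;;
    2) region="ap-southeast-1" ;;
  esac
  case $(( ($1 - 1) % 3 )) in
    0) zone="a" ;;
    1) zone="b" ;;
    2) zone="c" ;;
  esac
  echo "--locality=region=${region},zone=${region}${zone}"
}

locality_for_node 5   # prints --locality=region=eu-west-1,zone=eu-west-1b
```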
Start the cluster
Start CockroachDB on all nodes, then initialize the cluster from any node.
# Start services on all nodes
sudo systemctl daemon-reload
sudo systemctl enable --now cockroachdb
# Initialize cluster (run only once from any node)
sudo -u cockroach cockroach init --certs-dir=/var/lib/cockroach/certs --host=localhost:26257
# Check cluster status
sudo -u cockroach cockroach node status --certs-dir=/var/lib/cockroach/certs --host=localhost:26257
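To check the status output from a script rather than by eye, the tab-separated format is easier to parse. The sketch below counts live nodes by locating the is_live column by name. It assumes booleans are printed as true or t, and the sample rows are made up for illustration and contain only a subset of the real columns.

```shell
# count_live_nodes: read `cockroach node status --format=tsv` output on stdin
# and print the number of rows whose is_live column is true.
count_live_nodes() {
  awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "is_live") col = i; next }
    col && ($col == "true" || $col == "t") { live++ }
    END { print live + 0 }
  '
}

# Illustrative input only; on a real node, pipe the command output instead:
#   cockroach node status --format=tsv --certs-dir=/var/lib/cockroach/certs --host=localhost:26257 | count_live_nodes
printf 'id\taddress\tis_available\tis_live\n1\t10.0.1.10:26257\ttrue\ttrue\n2\t10.0.1.11:26257\tfalse\tfalse\n' |
  count_live_nodes
```

A healthy cluster in this guide should report 9.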
Configure replication zones and data locality
Create regional databases
Configure databases with regional placement to optimize query performance and meet compliance requirements.
# Connect to cluster
sudo -u cockroach cockroach sql --certs-dir=/var/lib/cockroach/certs --host=localhost:26257
-- Create regional databases (run inside the SQL shell)
CREATE DATABASE app_us_east PRIMARY REGION "us-east-1" REGIONS "eu-west-1", "ap-southeast-1";
CREATE DATABASE app_eu_west PRIMARY REGION "eu-west-1" REGIONS "us-east-1", "ap-southeast-1";
CREATE DATABASE app_ap_southeast PRIMARY REGION "ap-southeast-1" REGIONS "us-east-1", "eu-west-1";
Configure survival goals
Set appropriate survival goals based on your availability requirements and compliance needs.
-- Set zone survival (survives single zone failure)
ALTER DATABASE app_us_east SURVIVE ZONE FAILURE;
-- Set region survival (survives region failure, requires 3+ regions)
ALTER DATABASE app_eu_west SURVIVE REGION FAILURE;
-- Verify configuration
SHOW DATABASES;
Create regional tables
Design tables with appropriate locality patterns for your data access patterns.
-- Regional table (all data in one region)
USE app_us_east;
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email STRING NOT NULL,
created_at TIMESTAMP DEFAULT now()
) LOCALITY REGIONAL BY TABLE IN "us-east-1";
-- Global table (replicated to all regions)
CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name STRING NOT NULL,
price DECIMAL(10,2)
) LOCALITY GLOBAL;
-- Regional by row table (each row is homed in the region named by its region column,
-- which must have type crdb_internal_region)
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
region crdb_internal_region NOT NULL,
user_id UUID NOT NULL,
total DECIMAL(10,2),
created_at TIMESTAMP DEFAULT now()
) LOCALITY REGIONAL BY ROW AS "region";
Implement load balancing and failover automation
Install and configure HAProxy
Set up HAProxy for connection load balancing across CockroachDB nodes with automatic failover detection.
sudo apt update
sudo apt install -y haproxy
Configure HAProxy for CockroachDB
Create HAProxy configuration with health checks and region-aware backend selection. This tutorial builds on the concepts from HAProxy with keepalived monitoring.
global
daemon
user haproxy
group haproxy
log 127.0.0.1:514 local0
maxconn 4096
defaults
mode tcp
timeout connect 10s
timeout client 1m
timeout server 1m
option tcplog
log global
listen cockroachdb-sql
bind *:5432
balance roundrobin
option httpchk GET /health?ready=1
server cockroach-1 10.0.1.10:26257 check port 8080
server cockroach-2 10.0.1.11:26257 check port 8080
server cockroach-3 10.0.1.12:26257 check port 8080
server cockroach-4 10.0.2.10:26257 check port 8080 backup
server cockroach-5 10.0.2.11:26257 check port 8080 backup
server cockroach-6 10.0.2.12:26257 check port 8080 backup
server cockroach-7 10.0.3.10:26257 check port 8080 backup
server cockroach-8 10.0.3.11:26257 check port 8080 backup
server cockroach-9 10.0.3.12:26257 check port 8080 backup
listen cockroachdb-ui
bind *:8080
balance roundrobin
option httpchk GET /health
server ui-1 10.0.1.10:8080 check
server ui-2 10.0.1.11:8080 check
server ui-3 10.0.1.12:8080 check
stats enable
stats uri /stats
stats refresh 30s
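Each region would normally run its own load balancer that prefers local nodes and marks the others as backup. As a sketch, the server lines can be generated per region instead of edited by hand. It assumes the cockroach-N names and 10.0.<region>.10-12 addresses from the configuration above. haproxy_server_lines is a hypothetical helper.

```shell
# haproxy_server_lines PRIMARY: print one `server` line per node, marking nodes
# outside the PRIMARY region (1, 2 or 3) as backup.
haproxy_server_lines() {
  local n=1 region i suffix
  for region in 1 2 3; do
    for i in 10 11 12; do
      if [ "$region" = "$1" ]; then
        suffix=""
      else
        suffix=" backup"
      fi
      echo "server cockroach-${n} 10.0.${region}.${i}:26257 check port 8080${suffix}"
      n=$((n + 1))
    done
  done
}

# Lines for a load balancer located in region 2 (eu-west-1).
haproxy_server_lines 2
```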
Enable and start HAProxy
Start HAProxy and verify it can connect to your CockroachDB cluster.
sudo systemctl enable --now haproxy
sudo systemctl status haproxy
# Test connection through HAProxy
sudo -u cockroach cockroach sql --certs-dir=/var/lib/cockroach/certs --host=localhost:5432 --execute="SELECT version();"
Configure application connection pooling
Install PgBouncer for connection pooling between applications and CockroachDB. This follows the patterns from PostgreSQL with PgBouncer.
sudo apt install -y pgbouncer
Configure PgBouncer for CockroachDB
Set up PgBouncer configuration optimized for CockroachDB connection patterns.
[databases]
app_us_east = host=localhost port=5432 dbname=app_us_east
app_eu_west = host=localhost port=5432 dbname=app_eu_west
app_ap_southeast = host=localhost port=5432 dbname=app_ap_southeast
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6543
auth_type = cert
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = session
max_client_conn = 1000
default_pool_size = 25
max_db_connections = 100
reserve_pool_size = 5
reserve_pool_timeout = 5
server_reset_query = DISCARD ALL
server_lifetime = 3600
server_idle_timeout = 600
client_idle_timeout = 0
stats_users = stats
admin_users = admin
ignore_startup_parameters = extra_float_digits
log_connections = 1
log_disconnections = 1
log_pooler_errors = 1
Monitor multi-region cluster performance
Install Prometheus for metrics collection
Set up Prometheus to collect CockroachDB metrics for multi-region monitoring.
sudo useradd -r -s /bin/false prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.48.0/prometheus-2.48.0.linux-amd64.tar.gz
tar -xzf prometheus-2.48.0.linux-amd64.tar.gz
sudo cp prometheus-2.48.0.linux-amd64/prometheus /usr/local/bin/
sudo cp prometheus-2.48.0.linux-amd64/promtool /usr/local/bin/
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /etc/prometheus /var/lib/prometheus
Configure Prometheus for CockroachDB
Create Prometheus configuration to scrape metrics from all CockroachDB nodes across regions.
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'cockroachdb'
    # A secure cluster serves its HTTP endpoints over TLS
    scheme: https
    tls_config:
      insecure_skip_verify: true
    static_configs:
      - targets:
          - '10.0.1.10:8080'
          - '10.0.1.11:8080'
          - '10.0.1.12:8080'
          - '10.0.2.10:8080'
          - '10.0.2.11:8080'
          - '10.0.2.12:8080'
          - '10.0.3.10:8080'
          - '10.0.3.11:8080'
          - '10.0.3.12:8080'
    metrics_path: '/_status/vars'
    scrape_interval: 10s
    scrape_timeout: 5s
  - job_name: 'haproxy'
    static_configs:
      - targets: ['localhost:8404']
    metrics_path: '/metrics'
Create Prometheus systemd service
Configure Prometheus as a systemd service for automatic startup and management.
[Unit]
Description=Prometheus Server
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5s
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--web.listen-address=0.0.0.0:9090 \
--web.enable-lifecycle \
--log.level=info
[Install]
WantedBy=multi-user.target
Install and configure Grafana
Set up Grafana for visualizing CockroachDB multi-region metrics. This extends the monitoring approach from FastAPI monitoring with Prometheus.
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana
Start monitoring services
Enable and start Prometheus and Grafana services.
sudo systemctl enable --now prometheus
sudo systemctl enable --now grafana-server
# Verify services
sudo systemctl status prometheus
sudo systemctl status grafana-server
Backup and disaster recovery strategies
Configure automated backups
Set up regular backups to cloud storage with encryption and regional distribution.
-- Create a weekly full backup schedule (replace the bucket name and credentials with your own)
CREATE SCHEDULE weekly_backup
FOR BACKUP INTO 's3://backup-bucket/cockroachdb?AUTH=specified&AWS_ACCESS_KEY_ID=key&AWS_SECRET_ACCESS_KEY=secret'
RECURRING '@weekly'
FULL BACKUP ALWAYS
WITH SCHEDULE OPTIONS first_run = 'now';
Create backup script for core version
The cockroach dump command was removed in CockroachDB v21.1, so on v24.3 a scripted backup has to use the BACKUP statement instead. Save the following script as /usr/local/bin/cockroach-backup.sh.
#!/bin/bash
set -euo pipefail
# Directory for cluster metadata exports
BACKUP_DIR="/var/backups/cockroachdb"
DATE=$(date +%Y%m%d_%H%M%S)
CERTS_DIR="/var/lib/cockroach/certs"
HOST="localhost:26257"
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Back up each database into its own backup collection.
# nodelocal://1/ resolves to the extern/ subdirectory of node 1's store
# (/var/lib/cockroach/data/extern/ with the service file above).
for db in app_us_east app_eu_west app_ap_southeast; do
echo "Backing up database: $db"
cockroach sql --certs-dir="$CERTS_DIR" --host="$HOST" \
--execute="BACKUP DATABASE ${db} INTO 'nodelocal://1/backups/${db}' AS OF SYSTEM TIME '-10s';"
done
# Export node status and zone configurations (cockroach zone ls no longer exists)
echo "Exporting cluster metadata"
cockroach node status --certs-dir="$CERTS_DIR" --host="$HOST" > "$BACKUP_DIR/nodes_${DATE}.txt"
cockroach sql --certs-dir="$CERTS_DIR" --host="$HOST" --execute="SHOW ALL ZONE CONFIGURATIONS;" > "$BACKUP_DIR/zones_${DATE}.txt"
# Clean up old metadata exports (keep 7 days)
find "$BACKUP_DIR" -name "*.txt" -mtime +7 -delete
echo "Backup completed: $DATE"
Set backup permissions and schedule
Configure proper permissions and create a cron job for automated backups.
sudo chmod +x /usr/local/bin/cockroach-backup.sh
sudo chown cockroach:cockroach /usr/local/bin/cockroach-backup.sh
sudo mkdir -p /var/backups/cockroachdb
sudo chown cockroach:cockroach /var/backups/cockroachdb
# Add to cockroach user's crontab
sudo -u cockroach crontab -e
# Add this line (runs daily at 02:00):
# 0 2 * * * /usr/local/bin/cockroach-backup.sh >> /var/log/cockroach/backup.log 2>&1
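A crontab entry needs five schedule fields (minute, hour, day of month, month, day of week) before the command, and cron rejects a line with fewer. The sketch below counts the fields, assuming the command is given as an absolute path so that the first field starting with / marks its beginning. cron_schedule_fields is a hypothetical helper.

```shell
# cron_schedule_fields LINE: print how many fields precede the command,
# where the command is taken to be the first field that starts with "/".
cron_schedule_fields() {
  printf '%s\n' "$1" | awk '{
    n = 0
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^\//) break
      n++
    }
    print n
  }'
}

cron_schedule_fields '0 2 * * * /usr/local/bin/cockroach-backup.sh >> /var/log/cockroach/backup.log 2>&1'   # prints 5
```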
Test disaster recovery
Create a disaster recovery test procedure to verify backup integrity and recovery time.
# Test restore procedure (on test cluster)
# 1. Stop CockroachDB on test nodes
sudo systemctl stop cockroachdb
# 2. Clear data directory
sudo rm -rf /var/lib/cockroach/data/*
# 3. Start cluster and initialize it
sudo systemctl start cockroachdb
cockroach init --certs-dir=/var/lib/cockroach/certs --host=localhost:26257
# 4. Restore from the latest backup in the collection written by the backup schedule
# (cockroach dump and its SQL files no longer exist in v24.3; substitute your own collection URI)
cockroach sql --certs-dir=/var/lib/cockroach/certs --host=localhost:26257 \
--execute="RESTORE DATABASE app_us_east FROM LATEST IN 's3://backup-bucket/cockroachdb?AUTH=specified&AWS_ACCESS_KEY_ID=key&AWS_SECRET_ACCESS_KEY=secret';"
# 5. Verify data integrity
cockroach sql --certs-dir=/var/lib/cockroach/certs --host=localhost:26257 \
--execute="SELECT count(*) FROM app_us_east.users;"
Verify your setup
# Check cluster health
sudo -u cockroach cockroach node status --certs-dir=/var/lib/cockroach/certs --host=localhost:26257
# Verify replication
sudo -u cockroach cockroach sql --certs-dir=/var/lib/cockroach/certs --host=localhost:26257 \
--execute="SHOW RANGES FROM DATABASE app_us_east;"
# Test HAProxy connection (a secure cluster requires the client certificate)
sudo -u cockroach psql "postgresql://root@localhost:5432/app_us_east?sslmode=verify-full&sslrootcert=/var/lib/cockroach/certs/ca.crt&sslcert=/var/lib/cockroach/certs/client.root.crt&sslkey=/var/lib/cockroach/certs/client.root.key" -c "SELECT version();"
# Check backup schedule
sudo -u cockroach cockroach sql --certs-dir=/var/lib/cockroach/certs --host=localhost:26257 \
--execute="SHOW SCHEDULES;"
# Verify monitoring endpoints
curl "http://localhost:9090/api/v1/query?query=up"
curl http://localhost:3000/api/health
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Node won't join cluster | Clock skew or network connectivity | Check time sync with chronyc sources -v and check firewall rules |
| High cross-region latency | Suboptimal table locality | Review table locality settings and move data closer to users |
| Certificate errors | Expired or missing certificates | Regenerate certificates and redistribute to all nodes |
| Backup failures | Insufficient disk space or permissions | Check df -h /var/backups and verify cockroach user permissions |
| HAProxy connection refused | Health check failing | Verify CockroachDB admin UI accessible on port 8080 |
Next steps
- Optimize application connections with advanced PgBouncer configuration
- Set up real-time replication monitoring and alerting
- Implement encrypted backups with GPG for compliance requirements
- Advanced performance tuning for high-throughput workloads
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration
COCKROACH_VERSION="24.3.0"
COCKROACH_USER="cockroach"
COCKROACH_HOME="/var/lib/cockroach"
CERTS_DIR="$COCKROACH_HOME/certs"
DATA_DIR="$COCKROACH_HOME/data"
LOG_DIR="/var/log/cockroach"
BACKUP_DIR="/var/backups/cockroachdb"
usage() {
echo "Usage: $0 <node-id> <join-address> [advertise-address]"
echo " node-id: Unique identifier for this node (1-9)"
echo " join-address: Address of existing cluster node or first node address"
echo " advertise-address: External IP address (defaults to primary interface IP)"
exit 1
}
log() {
echo -e "${GREEN}[INFO]${NC} $1"
}
warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
error() {
echo -e "${RED}[ERROR]${NC} $1"
exit 1
}
cleanup() {
warn "Installation failed. Cleaning up..."
systemctl stop cockroachdb 2>/dev/null || true
userdel $COCKROACH_USER 2>/dev/null || true
rm -rf $COCKROACH_HOME $LOG_DIR /etc/systemd/system/cockroachdb.service
}
trap cleanup ERR
# Validate arguments
[[ $# -lt 2 || $# -gt 3 ]] && usage
NODE_ID="$1"
JOIN_ADDRESS="$2"
ADVERTISE_ADDRESS="${3:-$(ip route get 8.8.8.8 | awk '/src/{print $7}' | head -1)}"
# Validate node ID
[[ ! "$NODE_ID" =~ ^[1-9]$ ]] && error "Node ID must be between 1-9"
# Check prerequisites
[[ $EUID -ne 0 ]] && error "This script must be run as root"
# Detect distribution
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
PKG_UPDATE="apt update"
FIREWALL_CMD="ufw"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
PKG_UPDATE="dnf update -y"
FIREWALL_CMD="firewall-cmd"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
PKG_UPDATE="yum update -y"
FIREWALL_CMD="firewall-cmd"
;;
*) error "Unsupported distribution: $ID" ;;
esac
else
error "Cannot detect distribution"
fi
log "[1/12] Updating system packages..."
$PKG_UPDATE
$PKG_INSTALL wget curl ca-certificates
log "[2/12] Creating cockroach user..."
if ! id $COCKROACH_USER >/dev/null 2>&1; then
useradd --system --shell /bin/bash --home $COCKROACH_HOME --create-home $COCKROACH_USER
fi
log "[3/12] Creating directory structure..."
mkdir -p $COCKROACH_HOME $CERTS_DIR $DATA_DIR $LOG_DIR $BACKUP_DIR
chown -R $COCKROACH_USER:$COCKROACH_USER $COCKROACH_HOME $LOG_DIR $BACKUP_DIR
chmod 755 $COCKROACH_HOME $LOG_DIR $BACKUP_DIR
chmod 700 $CERTS_DIR $DATA_DIR
log "[4/12] Downloading CockroachDB $COCKROACH_VERSION..."
cd /tmp
wget -q "https://binaries.cockroachdb.com/cockroach-v$COCKROACH_VERSION.linux-amd64.tgz"
tar -xzf "cockroach-v$COCKROACH_VERSION.linux-amd64.tgz"
cp "cockroach-v$COCKROACH_VERSION.linux-amd64/cockroach" /usr/local/bin/
chmod 755 /usr/local/bin/cockroach
rm -rf /tmp/cockroach-v*
log "[5/12] Generating certificates..."
if [[ ! -f "$CERTS_DIR/ca.crt" ]]; then
cd $CERTS_DIR
sudo -u $COCKROACH_USER cockroach cert create-ca --certs-dir=$CERTS_DIR --ca-key=$CERTS_DIR/ca.key
sudo -u $COCKROACH_USER cockroach cert create-node localhost $ADVERTISE_ADDRESS $(hostname) --certs-dir=$CERTS_DIR --ca-key=$CERTS_DIR/ca.key
sudo -u $COCKROACH_USER cockroach cert create-client root --certs-dir=$CERTS_DIR --ca-key=$CERTS_DIR/ca.key
fi
log "[6/12] Creating systemd service..."
cat > /etc/systemd/system/cockroachdb.service << EOF
[Unit]
Description=CockroachDB database server
Requires=network.target
[Service]
Type=notify
WorkingDirectory=$COCKROACH_HOME
ExecStart=/usr/local/bin/cockroach start \
--certs-dir=$CERTS_DIR \
--advertise-addr=$ADVERTISE_ADDRESS:26257 \
--http-addr=$ADVERTISE_ADDRESS:8080 \
--listen-addr=0.0.0.0:26257 \\
--cache=.25 \
--max-sql-memory=.25 \
--store=$DATA_DIR \
--join=$JOIN_ADDRESS:26257 \
--log-dir=$LOG_DIR
TimeoutStopSec=60
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cockroach
User=$COCKROACH_USER
Group=$COCKROACH_USER
[Install]
WantedBy=default.target
EOF
systemctl daemon-reload
systemctl enable cockroachdb
log "[7/12] Configuring firewall..."
case "$PKG_MGR" in
apt)
$PKG_INSTALL ufw
ufw --force enable
ufw allow 26257/tcp
ufw allow 8080/tcp
;;
*)
systemctl enable --now firewalld
firewall-cmd --permanent --add-port=26257/tcp
firewall-cmd --permanent --add-port=8080/tcp
firewall-cmd --reload
;;
esac
log "[8/12] Setting up time synchronization..."
case "$PKG_MGR" in
apt)
$PKG_INSTALL chrony
systemctl enable --now chrony
;;
*)
$PKG_INSTALL chrony
systemctl enable --now chronyd
;;
esac
log "[9/12] Starting CockroachDB..."
systemctl start cockroachdb
sleep 10
log "[10/12] Creating backup script..."
cat > /usr/local/bin/cockroach-backup.sh << 'EOF'
#!/bin/bash
set -euo pipefail
DATE=$(date +%Y%m%d_%H%M%S)
CERTS_DIR="/var/lib/cockroach/certs"
BACKUP_DIR="/var/backups/cockroachdb"
HOST="localhost:26257"
mkdir -p "$BACKUP_DIR"
for db in $(cockroach sql --certs-dir="$CERTS_DIR" --host="$HOST" --format=tsv --execute="SELECT database_name FROM [SHOW DATABASES];" | tail -n +2 | grep -v "information_schema\|pg_catalog\|pg_extension\|crdb_internal\|system"); do
# cockroach dump was removed in v21.1; BACKUP writes under the extern/ directory of node 1's store
cockroach sql --certs-dir="$CERTS_DIR" --host="$HOST" --execute="BACKUP DATABASE ${db} INTO 'nodelocal://1/backups/${db}' AS OF SYSTEM TIME '-10s';"
done
cockroach node status --certs-dir="$CERTS_DIR" --host="$HOST" > "$BACKUP_DIR/nodes_${DATE}.txt"
find "$BACKUP_DIR" -name "*.txt" -mtime +7 -delete
EOF
chmod 755 /usr/local/bin/cockroach-backup.sh
chown $COCKROACH_USER:$COCKROACH_USER /usr/local/bin/cockroach-backup.sh
log "[11/12] Setting up backup cron job..."
sudo -u $COCKROACH_USER crontab -l 2>/dev/null | { cat; echo "0 2 * * * /usr/local/bin/cockroach-backup.sh >> $LOG_DIR/backup.log 2>&1"; } | sudo -u $COCKROACH_USER crontab -
log "[12/12] Verifying installation..."
sleep 5
systemctl is-active --quiet cockroachdb || error "CockroachDB service is not running"
# Initialize cluster on first node
if [[ "$ADVERTISE_ADDRESS" == "$JOIN_ADDRESS" ]] || [[ "$NODE_ID" == "1" ]]; then
log "Initializing cluster (first node)..."
timeout 30 sudo -u $COCKROACH_USER cockroach init --certs-dir=$CERTS_DIR --host=localhost:26257 || warn "Cluster may already be initialized"
fi
# Verify node status
sleep 10
sudo -u $COCKROACH_USER cockroach node status --certs-dir=$CERTS_DIR --host=localhost:26257
log "CockroachDB installation completed successfully!"
log "Node ID: $NODE_ID"
log "Advertise Address: $ADVERTISE_ADDRESS:26257"
log "Admin UI: https://$ADVERTISE_ADDRESS:8080"
log "Data Directory: $DATA_DIR"
log "Certificates: $CERTS_DIR"
log "Logs: $LOG_DIR"
Review the script before running. Execute it as root on each node with: sudo bash install.sh <node-id> <join-address> [advertise-address]