Install and configure CockroachDB cluster with high availability and distributed SQL

Intermediate 45 min Apr 03, 2026 25 views
Ubuntu 24.04 Ubuntu 22.04 Debian 12 AlmaLinux 9 Rocky Linux 9 Fedora 41

Set up a production-ready CockroachDB cluster with SSL certificates, load balancing, and automated backups for distributed SQL workloads. This tutorial covers multi-node deployment, security hardening, and performance optimization for enterprise environments.

Prerequisites

  • Root or sudo access
  • At least 3 servers for production cluster
  • 8GB RAM minimum per node
  • Fast SSD storage recommended
  • Stable network connectivity between nodes

What this solves

CockroachDB provides horizontally scalable, distributed SQL capabilities with automatic failover and strong consistency guarantees. This tutorial helps you deploy a production-ready CockroachDB cluster with SSL encryption, load balancing, and automated backup strategies for mission-critical applications requiring global distribution and high availability.

Step-by-step installation

Update system packages and install dependencies

Start by updating your package manager and installing required dependencies for CockroachDB cluster deployment.

sudo apt update && sudo apt upgrade -y
sudo apt install -y wget curl gnupg2 software-properties-common
sudo dnf update -y
sudo dnf install -y wget curl gnupg2

Download and install CockroachDB binary

Download the latest CockroachDB release and install it system-wide with proper permissions.

cd /tmp
wget https://binaries.cockroachdb.com/cockroach-v23.2.4.linux-amd64.tgz
tar -xzf cockroach-v23.2.4.linux-amd64.tgz
sudo cp cockroach-v23.2.4.linux-amd64/cockroach /usr/local/bin/
sudo chmod 755 /usr/local/bin/cockroach

Create CockroachDB user and directories

Create a dedicated system user for CockroachDB and set up the required directory structure with secure permissions.

sudo useradd --system --shell /bin/false --home-dir /var/lib/cockroach cockroach
sudo mkdir -p /var/lib/cockroach/{data,certs,logs}
sudo mkdir -p /etc/cockroach
sudo chown -R cockroach:cockroach /var/lib/cockroach
sudo chmod 700 /var/lib/cockroach/data
sudo chmod 755 /var/lib/cockroach/certs

Generate SSL certificates for secure cluster communication

Create a Certificate Authority and generate SSL certificates for secure inter-node communication and client connections.

cd /var/lib/cockroach/certs
sudo -u cockroach /usr/local/bin/cockroach cert create-ca --certs-dir=/var/lib/cockroach/certs --ca-key=/var/lib/cockroach/certs/ca.key
sudo -u cockroach /usr/local/bin/cockroach cert create-node node1.example.com node2.example.com node3.example.com localhost 127.0.0.1 203.0.113.10 203.0.113.11 203.0.113.12 --certs-dir=/var/lib/cockroach/certs --ca-key=/var/lib/cockroach/certs/ca.key
sudo -u cockroach /usr/local/bin/cockroach cert create-client root --certs-dir=/var/lib/cockroach/certs --ca-key=/var/lib/cockroach/certs/ca.key
Note: Replace the hostnames and IP addresses with your actual node addresses. Copy the generated certificates to all cluster nodes using secure methods like scp with proper key authentication.

Configure CockroachDB cluster settings

Create the main configuration file with cluster settings, performance tuning, and security parameters.

# CockroachDB Cluster Configuration
[logging]
level = info
file-defaults:
  dir = /var/lib/cockroach/logs
  max-file-size = 100MiB
  max-group-size = 1GiB
  file-defaults:
    format = crdb-v2-tty

[cluster]
name = production-cluster
organization = example-corp

[security]
max-offset = 500ms
allow-node-client-cert = true

[performance]
cache-size = 4GiB
max-sql-memory = 8GiB
max-disk-temp-storage = 32GiB

Create systemd service for cluster management

Set up a systemd service file for automatic startup, restart policies, and resource management.

[Unit]
Description=CockroachDB Distributed SQL Database
Requires=network.target
After=network.target

[Service]
Type=notify
User=cockroach
Group=cockroach
ExecStart=/usr/local/bin/cockroach start \
  --certs-dir=/var/lib/cockroach/certs \
  --store=/var/lib/cockroach/data \
  --listen-addr=0.0.0.0:26257 \
  --http-addr=0.0.0.0:8080 \
  --join=node1.example.com:26257,node2.example.com:26257,node3.example.com:26257 \
  --cache=4GiB \
  --max-sql-memory=8GiB \
  --background
ExecReload=/bin/kill -HUP $MAINPID
KillMode=control-group
KillSignal=SIGINT
TimeoutStopSec=60
Restart=always
RestartSec=10
SyslogIdentifier=cockroachdb
LimitNOFILE=65536
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target

Configure firewall rules for cluster communication

Open required ports for CockroachDB cluster communication and admin interface access.

sudo ufw allow 26257/tcp comment 'CockroachDB SQL'
sudo ufw allow 8080/tcp comment 'CockroachDB Admin UI'
sudo ufw allow from 203.0.113.0/24 to any port 26257
sudo ufw reload
sudo firewall-cmd --permanent --add-port=26257/tcp
sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="203.0.113.0/24" port protocol="tcp" port="26257" accept'
sudo firewall-cmd --reload

Initialize the cluster on the first node

Start the CockroachDB service on the first node and initialize the cluster with security settings.

sudo systemctl daemon-reload
sudo systemctl enable cockroachdb
sudo systemctl start cockroachdb
sudo systemctl status cockroachdb

Initialize cluster database

Once the first node is running, initialize the cluster and create the initial database structure.

sudo -u cockroach /usr/local/bin/cockroach init \
  --certs-dir=/var/lib/cockroach/certs \
  --host=localhost:26257

Set up HAProxy load balancer

Configure HAProxy to distribute SQL connections across cluster nodes with health checks and failover support. This integrates with our HAProxy high availability setup.

sudo apt install -y haproxy
sudo dnf install -y haproxy
global
    daemon
    user haproxy
    group haproxy
    log stdout local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s

defaults
    mode tcp
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
    option dontlognull
    option redispatch
    retries 3
    maxconn 3000

listen cockroachdb-sql
    bind *:5432
    mode tcp
    option tcplog
    option httpchk GET /health?ready=1
    server node1 node1.example.com:26257 check port 8080 check-ssl verify none
    server node2 node2.example.com:26257 check port 8080 check-ssl verify none
    server node3 node3.example.com:26257 check port 8080 check-ssl verify none

listen cockroachdb-admin
    bind *:8081
    mode tcp
    option tcplog
    server node1 node1.example.com:8080 check
    server node2 node2.example.com:8080 check backup
    server node3 node3.example.com:8080 check backup

listen stats
    bind *:1936
    mode http
    stats enable
    stats uri /stats
    stats refresh 30s

Create database users and security policies

Connect to the cluster and create application users with appropriate privileges and security constraints.

sudo -u cockroach /usr/local/bin/cockroach sql \
  --certs-dir=/var/lib/cockroach/certs \
  --host=localhost:26257 \
  --execute="CREATE USER appuser WITH PASSWORD 'SecureP@ssw0rd123!';"

sudo -u cockroach /usr/local/bin/cockroach sql \
  --certs-dir=/var/lib/cockroach/certs \
  --host=localhost:26257 \
  --execute="CREATE DATABASE production;"

sudo -u cockroach /usr/local/bin/cockroach sql \
  --certs-dir=/var/lib/cockroach/certs \
  --host=localhost:26257 \
  --execute="GRANT ALL ON DATABASE production TO appuser;"

sudo -u cockroach /usr/local/bin/cockroach sql \
  --certs-dir=/var/lib/cockroach/certs \
  --host=localhost:26257 \
  --execute="SET CLUSTER SETTING server.host_based_authentication.configuration = 'host all all 0.0.0.0/0 cert-password';"

Configure automated backup system

Set up automated daily backups with retention policies and monitoring. This works with our backup automation strategies.

#!/bin/bash

CockroachDB Backup Script

set -euo pipefail BACKUP_DIR="/var/backups/cockroach" LOG_FILE="/var/log/cockroach-backup.log" RETENTION_DAYS=30 DATE=$(date +"%Y-%m-%d_%H-%M-%S")

Create backup directory

mkdir -p "$BACKUP_DIR"

Function to log messages

log_message() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE" } log_message "Starting CockroachDB backup"

Perform backup

if /usr/local/bin/cockroach dump production \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257 > "$BACKUP_DIR/backup_$DATE.sql"; then log_message "Backup completed successfully: backup_$DATE.sql" # Compress backup gzip "$BACKUP_DIR/backup_$DATE.sql" log_message "Backup compressed: backup_$DATE.sql.gz" else log_message "ERROR: Backup failed" exit 1 fi

Clean old backups

find "$BACKUP_DIR" -name "backup_*.sql.gz" -mtime +$RETENTION_DAYS -delete log_message "Cleaned backups older than $RETENTION_DAYS days" log_message "Backup process completed"
sudo chmod 755 /usr/local/bin/cockroach-backup.sh
sudo chown cockroach:cockroach /usr/local/bin/cockroach-backup.sh

Create systemd timer for automated backups

Set up automated backup scheduling with systemd timers and proper logging.

[Unit]
Description=CockroachDB Backup Service
Wants=cockroachdb.service
After=cockroachdb.service

[Service]
Type=oneshot
User=cockroach
Group=cockroach
ExecStart=/usr/local/bin/cockroach-backup.sh
StandardOutput=append:/var/log/cockroach-backup.log
StandardError=append:/var/log/cockroach-backup.log
[Unit]
Description=Run CockroachDB backup daily
Requires=cockroach-backup.service

[Timer]
OnCalendar=daily
RandomizedDelaySec=1800
Persistent=true

[Install]
WantedBy=timers.target
sudo systemctl daemon-reload
sudo systemctl enable cockroach-backup.timer
sudo systemctl start cockroach-backup.timer

Configure cluster monitoring and alerting

Set up comprehensive monitoring with custom metrics collection for cluster health and performance tracking.

#!/bin/bash

CockroachDB Health Check Script

set -euo pipefail LOG_FILE="/var/log/cockroach-health.log" ALERT_THRESHOLD_CPU=80 ALERT_THRESHOLD_MEMORY=85 log_message() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE" }

Check cluster status

if ! /usr/local/bin/cockroach node status \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257 &>/dev/null; then log_message "ALERT: Cluster health check failed" exit 1 fi

Check individual node health

NODE_STATUS=$(/usr/local/bin/cockroach sql \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257 \ --execute="SELECT node_id, address, is_available FROM crdb_internal.gossip_nodes;" \ --format=csv) log_message "Node status check: $NODE_STATUS"

Check replication status

REPL_STATUS=$(/usr/local/bin/cockroach sql \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257 \ --execute="SELECT range_id, replicas FROM crdb_internal.ranges WHERE database_name='production' LIMIT 5;" \ --format=csv) log_message "Replication status: $REPL_STATUS" log_message "Health check completed successfully"
sudo chmod 755 /usr/local/bin/cockroach-health-check.sh
sudo chown cockroach:cockroach /usr/local/bin/cockroach-health-check.sh
Security Warning: Never use chmod 777 on certificate directories or database files. CockroachDB certificates should have 600 permissions for private keys and 644 for certificates. Directory permissions should be 700 for data directories to prevent unauthorized access.

Verify your setup

Test cluster connectivity, replication, and performance to ensure proper deployment.

# Check cluster status
sudo -u cockroach /usr/local/bin/cockroach node status \
  --certs-dir=/var/lib/cockroach/certs \
  --host=localhost:26257

Verify database connectivity

sudo -u cockroach /usr/local/bin/cockroach sql \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257 \ --execute="SHOW DATABASES;"

Test load balancer

curl -k https://localhost:8081/health

Check backup timer status

sudo systemctl status cockroach-backup.timer

Verify cluster replication

sudo -u cockroach /usr/local/bin/cockroach sql \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257 \ --execute="SELECT count(*) FROM crdb_internal.ranges;"

Test performance with sample data

sudo -u cockroach /usr/local/bin/cockroach workload init bank \ --certs-dir=/var/lib/cockroach/certs \ --host=localhost:26257

Common issues

Symptom Cause Fix
Node won't join cluster Certificate mismatch or network connectivity Verify certificates copied correctly and firewall rules allow port 26257
SSL connection errors Certificate permissions or missing client certificates Check cert permissions (644) and ensure client certificates exist
High memory usage Cache settings too aggressive Adjust --cache and --max-sql-memory parameters in systemd service
Backup script fails Insufficient disk space or permission issues Check /var/backups space and verify cockroach user ownership
HAProxy health checks failing Admin UI not accessible or SSL verification issues Verify port 8080 accessible and adjust check-ssl verify settings
Cluster split-brain Network partition between nodes Ensure odd number of nodes (3,5,7) and stable network connectivity

Next steps

Automated install script

Run this to automate the entire setup

#cockroachdb #distributed-sql #cockroachdb-cluster #horizontal-scaling #multi-region-database #cockroachdb-ssl #database-clustering #distributed-database

Need help?

Don't want to manage this yourself?

We handle infrastructure for businesses that depend on uptime. From initial setup to ongoing operations.

Talk to an engineer