Implement automated ClickHouse backups with S3 storage and monitoring

Intermediate · 45 min · Apr 10, 2026
Ubuntu 24.04 · Debian 12 · AlmaLinux 9 · Rocky Linux 9

Set up automated backup solutions for ClickHouse databases with S3 storage, retention policies, and monitoring alerts. This tutorial covers backup tool installation, S3 configuration, scheduling with systemd timers, and health monitoring.

Prerequisites

  • ClickHouse server installed and running
  • AWS S3 bucket with appropriate permissions
  • Root or sudo access

What this solves

ClickHouse databases require a reliable backup strategy to protect against data loss and enable disaster recovery. Manual backups are error-prone and don't scale with production workloads. This tutorial implements automated ClickHouse backups using the Altinity clickhouse-backup tool with S3 storage, configures retention policies, and sets up monitoring to ensure backup health and alert on failures.

Step-by-step installation

Update system packages

Start by updating your package manager and installing the required dependencies for the backup tools. Run the command set that matches your distribution.

# Debian/Ubuntu
sudo apt update && sudo apt upgrade -y
sudo apt install -y wget curl tar gzip awscli

# AlmaLinux/Rocky (the AWS CLI package name varies by repo; it may require EPEL or the AWS bundled installer)
sudo dnf update -y
sudo dnf install -y wget curl tar gzip awscli2

Install clickhouse-backup tool

Download and install the latest version of clickhouse-backup, which provides native ClickHouse backup functionality with S3 support.

BACKUP_VERSION=$(curl -s https://api.github.com/repos/Altinity/clickhouse-backup/releases/latest | grep '"tag_name"' | cut -d'"' -f4)
wget https://github.com/Altinity/clickhouse-backup/releases/download/${BACKUP_VERSION}/clickhouse-backup-linux-amd64.tar.gz
tar -xzf clickhouse-backup-linux-amd64.tar.gz
sudo mv clickhouse-backup/clickhouse-backup /usr/local/bin/
sudo chmod +x /usr/local/bin/clickhouse-backup

Create backup user and directories

Create a dedicated system user for backup operations and set up the required directory structure with proper permissions.

sudo useradd -r -s /bin/false -d /var/lib/clickhouse-backup clickhouse-backup
sudo mkdir -p /var/lib/clickhouse-backup/{config,logs,temp}
sudo mkdir -p /etc/clickhouse-backup
sudo chown -R clickhouse-backup:clickhouse-backup /var/lib/clickhouse-backup
sudo chmod 750 /var/lib/clickhouse-backup
sudo chmod 755 /etc/clickhouse-backup

Configure S3 storage bucket

Set up your S3 bucket and access credentials. Replace the values with your actual S3 configuration.

aws configure set aws_access_key_id YOUR_ACCESS_KEY_ID
aws configure set aws_secret_access_key YOUR_SECRET_ACCESS_KEY
aws configure set default.region us-east-1
aws s3 mb s3://clickhouse-backups-example --region us-east-1
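As an optional hardening step, an S3 lifecycle rule can abort stale multipart uploads so that interrupted transfers don't silently accumulate storage costs. This is a sketch, not part of the clickhouse-backup setup itself; the file name lifecycle.json is arbitrary, and the rule assumes the bucket and path prefix used in this tutorial:

```json
{
  "Rules": [
    {
      "ID": "abort-stale-multipart-uploads",
      "Status": "Enabled",
      "Filter": {"Prefix": "clickhouse-backups/"},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
    }
  ]
}
```

Apply it with aws s3api put-bucket-lifecycle-configuration --bucket clickhouse-backups-example --lifecycle-configuration file://lifecycle.json. Note that backup retention itself is handled by clickhouse-backup's own settings later in this tutorial; this rule only cleans up failed upload fragments.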

Create backup configuration

Configure clickhouse-backup with S3 storage settings, retention policies, and ClickHouse connection parameters. Save the following as /etc/clickhouse-backup/config.yml:

general:
  remote_storage: s3
  max_file_size: 1073741824
  disable_progress_bar: true
  backups_to_keep_local: 3
  backups_to_keep_remote: 30
  log_level: info
  allow_empty_backups: false

clickhouse:
  username: default
  password: ""
  host: localhost
  port: 9000
  data_path: /var/lib/clickhouse
  skip_tables:
    - system.*
    - information_schema.*
  timeout: 5m
  connection_timeout: 10s
  log_sql_queries: true

s3:
  access_key: YOUR_ACCESS_KEY_ID
  secret_key: YOUR_SECRET_ACCESS_KEY
  bucket: clickhouse-backups-example
  endpoint: ""
  region: us-east-1
  acl: private
  assume_role_arn: ""
  force_path_style: false
  path: clickhouse-backups/
  disable_ssl: false
  compression_level: 1
  compression_format: gzip
  sse: AES256
  disable_cert_verification: false
  storage_class: STANDARD
  concurrency: 1
  part_size: 134217728

# azblob settings apply only to Azure Blob storage; for an S3-only setup the defaults below can stay as-is
azblob:
  endpoint_suffix: core.windows.net
  account_name: ""
  account_key: ""
  sas_token: ""
  use_managed_identity: false
  container: ""
  path: ""
  compression_level: 1
  compression_format: gzip
  timeout: 4h

Set configuration file permissions

Secure the configuration file containing S3 credentials with restrictive permissions.

sudo chown clickhouse-backup:clickhouse-backup /etc/clickhouse-backup/config.yml
sudo chmod 640 /etc/clickhouse-backup/config.yml

Create backup script

Create a backup script that handles backup creation and upload with error handling and logging. Save it as /usr/local/bin/clickhouse-backup.sh:

#!/bin/bash

set -euo pipefail

# Configuration
CONFIG_FILE="/etc/clickhouse-backup/config.yml"
LOG_FILE="/var/lib/clickhouse-backup/logs/backup-$(date +%Y%m%d-%H%M%S).log"
BACKUP_NAME="backup-$(date +%Y%m%d-%H%M%S)"
MAX_LOG_FILES=10

# Ensure log directory exists
sudo -u clickhouse-backup mkdir -p /var/lib/clickhouse-backup/logs

# Function to log messages
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | sudo -u clickhouse-backup tee -a "$LOG_FILE"
}

# Function to clean up old log files
cleanup_logs() {
    find /var/lib/clickhouse-backup/logs -name "backup-*.log" -type f -mtime +7 -delete 2>/dev/null || true
    find /var/lib/clickhouse-backup/logs -name "backup-*.log" -type f | sort -r | tail -n +$((MAX_LOG_FILES+1)) | xargs -r rm -f
}

# Function to send notification
send_notification() {
    local status="$1"
    local message="$2"
    # Add your notification logic here (email, Slack, etc.)
    if command -v mail >/dev/null 2>&1; then
        echo "$message" | mail -s "ClickHouse Backup $status" admin@example.com
    fi
    log_message "$status: $message"
}

# Trap to handle errors
trap 'send_notification "FAILED" "Backup failed at $(date). Check log: $LOG_FILE"' ERR

log_message "Starting ClickHouse backup process"

# Test ClickHouse connection
log_message "Testing ClickHouse connection"
sudo -u clickhouse-backup /usr/local/bin/clickhouse-backup --config "$CONFIG_FILE" list local >/dev/null

# Create backup
log_message "Creating backup: $BACKUP_NAME"
sudo -u clickhouse-backup /usr/local/bin/clickhouse-backup --config "$CONFIG_FILE" create "$BACKUP_NAME"

# Upload to S3. Retention is applied automatically during create/upload based on
# backups_to_keep_local and backups_to_keep_remote in config.yml, so no explicit
# delete commands are needed here.
log_message "Uploading backup to S3"
sudo -u clickhouse-backup /usr/local/bin/clickhouse-backup --config "$CONFIG_FILE" upload "$BACKUP_NAME"

# Cleanup old log files
cleanup_logs

log_message "Backup completed successfully: $BACKUP_NAME"
send_notification "SUCCESS" "ClickHouse backup completed successfully: $BACKUP_NAME"
exit 0
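Before wiring the script into systemd, the cleanup_logs pruning pipeline can be exercised safely against throwaway files. This sketch (the temp directory and fake file names are illustrative) confirms that only the newest MAX_LOG_FILES logs survive:

```shell
# Exercise the cleanup_logs pruning against disposable files in a temp directory
tmp=$(mktemp -d)
for i in $(seq -w 1 15); do
    touch "$tmp/backup-202604${i}-000000.log"   # 15 fake daily log files
done
MAX_LOG_FILES=10
# Same pipeline as cleanup_logs: sort newest first, delete everything after the first 10
find "$tmp" -name "backup-*.log" -type f | sort -r | tail -n +$((MAX_LOG_FILES+1)) | xargs -r rm -f
ls "$tmp" | wc -l   # 10 files remain
```

Because the date stamp sorts lexically, sort -r reliably puts the newest logs first without parsing any timestamps.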

Make backup script executable

Set appropriate permissions on the backup script to allow execution by the backup user.

sudo chmod +x /usr/local/bin/clickhouse-backup.sh
sudo chown root:clickhouse-backup /usr/local/bin/clickhouse-backup.sh

Configure systemd service

Create a systemd service unit at /etc/systemd/system/clickhouse-backup.service for running backup operations with proper isolation.

[Unit]
Description=ClickHouse Database Backup
After=clickhouse-server.service
Requires=clickhouse-server.service

[Service]
Type=oneshot
User=root
Group=clickhouse-backup
ExecStart=/usr/local/bin/clickhouse-backup.sh
TimeoutStartSec=3600
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
NoNewPrivileges=true
ReadWritePaths=/var/lib/clickhouse-backup /tmp
StandardOutput=journal
StandardError=journal
SyslogIdentifier=clickhouse-backup
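The unit above sandboxes the job but sets no explicit resource caps. If backups compete with query traffic for CPU or memory, a drop-in can bound them; this is an optional sketch, and the 2G/50% values are illustrative, not from this tutorial:

```ini
# /etc/systemd/system/clickhouse-backup.service.d/limits.conf
[Service]
MemoryMax=2G
CPUQuota=50%
IOWeight=50
```

Run sudo systemctl daemon-reload after adding the drop-in so systemd picks it up.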

Create systemd timer

Set up a systemd timer at /etc/systemd/system/clickhouse-backup.timer to run backups automatically on a daily schedule.

[Unit]
Description=Run ClickHouse backup daily
Requires=clickhouse-backup.service

[Timer]
OnCalendar=daily
RandomizedDelaySec=1800
Persistent=true

[Install]
WantedBy=timers.target

Enable and start the backup timer

Enable the systemd timer to start automatically and begin the scheduled backup process.

sudo systemctl daemon-reload
sudo systemctl enable clickhouse-backup.timer
sudo systemctl start clickhouse-backup.timer
sudo systemctl status clickhouse-backup.timer

Create backup monitoring script

Implement monitoring to check backup status and alert on failures or missing backups. Save the following as /usr/local/bin/check-backup-health.sh:

#!/bin/bash

set -euo pipefail

CONFIG_FILE="/etc/clickhouse-backup/config.yml"
MAX_BACKUP_AGE_HOURS=36
ALERT_EMAIL="admin@example.com"
LOG_FILE="/var/lib/clickhouse-backup/logs/health-check.log"

# Function to log messages
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Function to send alert
send_alert() {
    local message="$1"
    log_message "ALERT: $message"
    if command -v mail >/dev/null 2>&1; then
        echo "$message" | mail -s "ClickHouse Backup Alert" "$ALERT_EMAIL"
    fi
}

# Check that the backup timer is enabled
if ! systemctl is-enabled clickhouse-backup.timer >/dev/null 2>&1; then
    send_alert "Backup timer is not enabled"
    exit 1
fi

log_message "Checking backup status"

# Get list of remote backups
if ! BACKUP_LIST=$(sudo -u clickhouse-backup /usr/local/bin/clickhouse-backup --config "$CONFIG_FILE" list remote 2>/dev/null); then
    send_alert "Failed to retrieve backup list from S3"
    exit 1
fi

# Check that at least one backup exists
if [ -z "$BACKUP_LIST" ]; then
    send_alert "No backups found in S3 storage"
    exit 1
fi

# Get the most recent backup name (|| true keeps set -e from aborting when grep finds nothing)
LATEST_BACKUP=$(echo "$BACKUP_LIST" | grep -oE "backup-[0-9]{8}-[0-9]{6}" | sort -r | head -1 || true)
if [ -z "$LATEST_BACKUP" ]; then
    send_alert "Could not determine latest backup timestamp"
    exit 1
fi

# Extract the timestamp from the backup name
BACKUP_TIMESTAMP=$(echo "$LATEST_BACKUP" | grep -oE "[0-9]{8}-[0-9]{6}")
BACKUP_DATE=$(echo "$BACKUP_TIMESTAMP" | cut -c1-8)
BACKUP_TIME=$(echo "$BACKUP_TIMESTAMP" | cut -c10-15)

# Convert to epoch time and compute the backup age
BACKUP_EPOCH=$(date -d "${BACKUP_DATE:0:4}-${BACKUP_DATE:4:2}-${BACKUP_DATE:6:2} ${BACKUP_TIME:0:2}:${BACKUP_TIME:2:2}:${BACKUP_TIME:4:2}" +%s)
CURRENT_EPOCH=$(date +%s)
AGE_HOURS=$(( (CURRENT_EPOCH - BACKUP_EPOCH) / 3600 ))
log_message "Latest backup: $LATEST_BACKUP (${AGE_HOURS}h old)"

# Check whether the backup is too old
if [ "$AGE_HOURS" -gt "$MAX_BACKUP_AGE_HOURS" ]; then
    send_alert "Latest backup is ${AGE_HOURS} hours old (threshold: ${MAX_BACKUP_AGE_HOURS}h)"
    exit 1
fi

# Check the last backup service run, unless it is currently running
if systemctl is-active clickhouse-backup.service >/dev/null 2>&1; then
    log_message "Backup service is currently running"
else
    LAST_STATUS=$(systemctl show clickhouse-backup.service -p ExecMainStatus --value)
    if [ "$LAST_STATUS" != "0" ]; then
        send_alert "Last backup service run failed with status: $LAST_STATUS"
        exit 1
    fi
fi

# Check disk space for local backups
BACKUP_DIR_USAGE=$(df /var/lib/clickhouse-backup | tail -1 | awk '{print $5}' | sed 's/%//')
if [ "$BACKUP_DIR_USAGE" -gt 85 ]; then
    send_alert "Backup directory disk usage is ${BACKUP_DIR_USAGE}% (threshold: 85%)"
fi

log_message "Backup health check passed"
exit 0
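The epoch conversion in the health check is easy to break by one character, so it is worth verifying in isolation with a known backup name. The sample name and expected epoch below assume UTC and GNU date:

```shell
# Verify the health check's timestamp parsing with a fixed backup name
name="backup-20260410-120000"
ts=$(echo "$name" | grep -oE '[0-9]{8}-[0-9]{6}')
d=$(echo "$ts" | cut -c1-8)    # date portion:  20260410
t=$(echo "$ts" | cut -c10-15)  # time portion:  120000
epoch=$(date -u -d "${d:0:4}-${d:4:2}-${d:6:2} ${t:0:2}:${t:2:2}:${t:4:2}" +%s)
echo "$epoch"   # 1775822400 = 2026-04-10T12:00:00Z
```

The production script uses local time rather than -u, which is consistent because the backup names are also generated in local time.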

Make health check script executable

Set permissions on the monitoring script and create a systemd timer for regular health checks.

sudo chmod +x /usr/local/bin/check-backup-health.sh
sudo chown root:clickhouse-backup /usr/local/bin/check-backup-health.sh

Create health check systemd service

Configure a systemd service at /etc/systemd/system/clickhouse-backup-health.service for running backup health checks with sandboxing and a start timeout.

[Unit]
Description=ClickHouse Backup Health Check
After=network.target

[Service]
Type=oneshot
User=root
Group=clickhouse-backup
ExecStart=/usr/local/bin/check-backup-health.sh
TimeoutStartSec=300
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
NoNewPrivileges=true
ReadWritePaths=/var/lib/clickhouse-backup
StandardOutput=journal
StandardError=journal
SyslogIdentifier=clickhouse-backup-health

Create health check timer

Set up a systemd timer at /etc/systemd/system/clickhouse-backup-health.timer to run health checks every six hours. You can validate OnCalendar expressions with systemd-analyze calendar.

[Unit]
Description=Run ClickHouse backup health check
Requires=clickhouse-backup-health.service

[Timer]
OnCalendar=*-*-* 00,06,12,18:00:00
RandomizedDelaySec=300
Persistent=true

[Install]
WantedBy=timers.target

Enable health check timer

Start the health monitoring timer to begin automated backup status monitoring.

sudo systemctl daemon-reload
sudo systemctl enable clickhouse-backup-health.timer
sudo systemctl start clickhouse-backup-health.timer
sudo systemctl status clickhouse-backup-health.timer

Configure log rotation

Set up logrotate to manage backup and health check log files and prevent disk space issues. Save the following as /etc/logrotate.d/clickhouse-backup; you can dry-run it with sudo logrotate -d /etc/logrotate.d/clickhouse-backup.

/var/lib/clickhouse-backup/logs/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 644 clickhouse-backup clickhouse-backup
}

Verify your setup

Test the backup system to ensure all components are working correctly.

# Test manual backup creation
sudo systemctl start clickhouse-backup.service
sudo journalctl -u clickhouse-backup.service -f

Check backup timer status

sudo systemctl list-timers clickhouse-backup.timer

Verify S3 uploads

sudo -u clickhouse-backup /usr/local/bin/clickhouse-backup --config /etc/clickhouse-backup/config.yml list remote

Test health check

sudo systemctl start clickhouse-backup-health.service
sudo journalctl -u clickhouse-backup-health.service -n 50

Check backup logs

sudo ls -la /var/lib/clickhouse-backup/logs/
sudo tail -f /var/lib/clickhouse-backup/logs/health-check.log

Common issues

Symptom                  | Cause                                      | Fix
Permission denied errors | Incorrect file ownership                   | sudo chown -R clickhouse-backup:clickhouse-backup /var/lib/clickhouse-backup
S3 upload fails          | Invalid credentials or bucket permissions  | Verify AWS credentials and bucket policy with aws s3 ls s3://your-bucket
Backup service fails     | ClickHouse connection timeout              | Increase timeout in config.yml and verify ClickHouse is running
Timer not running        | Systemd timer not enabled                  | sudo systemctl enable clickhouse-backup.timer && sudo systemctl start clickhouse-backup.timer
Health check alerts      | Old backup retention policy                | Adjust MAX_BACKUP_AGE_HOURS in the health check script
Disk space issues        | Local backup accumulation                  | Reduce backups_to_keep_local in config.yml

#clickhouse #backup #s3 #automation #monitoring
