Cassandra Backup Automation with nodetool

Automate Apache Cassandra backups using nodetool snapshots, systemd timers, and retention policies. Configure monitoring and alerting for production-grade backup management with automated cleanup and verification.

Prerequisites

Root or sudo access
At least 4GB RAM
50GB+ disk space for backups
Basic familiarity with systemd timers

What this solves

Manual Cassandra backups are error-prone and often forgotten until disaster strikes. This tutorial sets up automated snapshot backups using nodetool, systemd timers for scheduling, and retention policies for storage management. You'll also configure monitoring to track backup success and get alerts when backups fail.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest versions of dependencies.

sudo apt update && sudo apt upgrade -y

sudo dnf update -y

Install Apache Cassandra

Install Cassandra and Java runtime if not already present. This includes the nodetool command required for backup operations.

wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
echo "deb https://debian.cassandra.apache.org 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
sudo apt update
sudo apt install -y cassandra openjdk-11-jdk

sudo dnf install -y java-11-openjdk
sudo tee /etc/yum.repos.d/cassandra.repo << EOF
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
sudo dnf install -y cassandra

Configure Cassandra data directories

Set up proper directories for Cassandra data and backups with correct permissions. The cassandra user needs write access to backup locations.

sudo mkdir -p /var/lib/cassandra/backups
sudo mkdir -p /opt/cassandra/scripts
sudo chown -R cassandra:cassandra /var/lib/cassandra
sudo chmod 755 /var/lib/cassandra/backups

Enable Cassandra service

Start Cassandra and enable it to run on boot. Verify the service is running correctly before proceeding.

sudo systemctl enable --now cassandra
sudo systemctl status cassandra
nodetool status

Create backup script

Create a comprehensive backup script that handles snapshots, cleanup, and logging. This script creates consistent snapshots across all keyspaces.

#!/bin/bash

Cassandra backup script with nodetool
Usage: ./cassandra-backup.sh

set -euo pipefail

Configuration
BACKUP_DIR="/var/lib/cassandra/backups"
LOG_FILE="/var/log/cassandra/backup.log"
RETENTION_DAYS=7
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
SNAPSHOT_NAME="backup_${TIMESTAMP}"

Logging function
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

Create log directory
sudo mkdir -p $(dirname "$LOG_FILE")
sudo chown cassandra:cassandra $(dirname "$LOG_FILE")

log "Starting Cassandra backup: $SNAPSHOT_NAME"

Check if Cassandra is running
if ! nodetool status > /dev/null 2>&1; then
    log "ERROR: Cassandra is not running or nodetool failed"
    exit 1
fi

Create snapshot for all keyspaces
log "Creating snapshot: $SNAPSHOT_NAME"
if nodetool snapshot -t "$SNAPSHOT_NAME"; then
    log "Snapshot created successfully: $SNAPSHOT_NAME"
else
    log "ERROR: Failed to create snapshot"
    exit 1
fi

Copy snapshots to backup directory
log "Copying snapshots to backup directory"
BACKUP_PATH="$BACKUP_DIR/$SNAPSHOT_NAME"
mkdir -p "$BACKUP_PATH"

Find and copy all snapshot files
find /var/lib/cassandra/data -name "$SNAPSHOT_NAME" -type d | while read snapshot_dir; do
    # Extract keyspace and table info from path
    relative_path=$(echo "$snapshot_dir" | sed 's|/var/lib/cassandra/data/||')
    target_dir="$BACKUP_PATH/$relative_path"
    
    mkdir -p "$(dirname "$target_dir")"
    cp -r "$snapshot_dir" "$(dirname "$target_dir")/"
    log "Copied: $relative_path"
done

Create backup metadata
cat > "$BACKUP_PATH/backup_info.txt" << EOF
Backup Name: $SNAPSHOT_NAME
Backup Date: $(date)
Cassandra Version: $(nodetool version)
Node Status: $(nodetool status)
Keyspaces: $(nodetool describecluster | grep -A 10 "Schema versions")
EOF

Compress backup
log "Compressing backup"
cd "$BACKUP_DIR"
tar -czf "${SNAPSHOT_NAME}.tar.gz" "$SNAPSHOT_NAME"/
rm -rf "$SNAPSHOT_NAME"

Calculate backup size
BACKUP_SIZE=$(du -h "${SNAPSHOT_NAME}.tar.gz" | cut -f1)
log "Backup compressed: ${SNAPSHOT_NAME}.tar.gz ($BACKUP_SIZE)"

Clean old snapshots from Cassandra
log "Cleaning old snapshots from Cassandra data directory"
nodetool clearsnapshot

Clean old backup files
log "Cleaning backups older than $RETENTION_DAYS days"
find "$BACKUP_DIR" -name "backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete

log "Backup completed successfully: ${SNAPSHOT_NAME}.tar.gz"

Verify backup integrity
if tar -tzf "${SNAPSHOT_NAME}.tar.gz" > /dev/null 2>&1; then
    log "Backup integrity verified"
    echo "SUCCESS" > "$BACKUP_DIR/.last_backup_status"
else
    log "ERROR: Backup integrity check failed"
    echo "FAILED" > "$BACKUP_DIR/.last_backup_status"
    exit 1
fi

exit 0

Set script permissions

Make the backup script executable and ensure proper ownership for the cassandra user.

sudo chmod +x /opt/cassandra/scripts/cassandra-backup.sh
sudo chown cassandra:cassandra /opt/cassandra/scripts/cassandra-backup.sh

Create systemd service unit

Create a systemd service to run the backup script with proper user context and logging.

[Unit]
Description=Cassandra Backup Service
Requires=cassandra.service
After=cassandra.service

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/cassandra-backup.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-backup

Security settings
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/cassandra /var/log/cassandra

Resource limits
MemoryLimit=512M
TimeoutSec=3600

Create systemd timer unit

Configure a systemd timer to run backups automatically. This schedules daily backups at 2 AM with randomized delay to avoid resource conflicts.

[Unit]
Description=Run Cassandra Backup Daily
Requires=cassandra-backup.service

[Timer]
Run daily at 2 AM with 30-minute random delay
OnCalendar=02:00
RandomizedDelaySec=1800
Persistent=true

[Install]
WantedBy=timers.target

Enable backup automation

Reload systemd configuration and enable the backup timer to start on boot.

sudo systemctl daemon-reload
sudo systemctl enable cassandra-backup.timer
sudo systemctl start cassandra-backup.timer

Create monitoring script

Create a monitoring script to check backup status and generate alerts when backups fail.

#!/bin/bash

Cassandra backup health check script
Returns 0 if backup is healthy, 1 if issues found

set -euo pipefail

BACKUP_DIR="/var/lib/cassandra/backups"
LOG_FILE="/var/log/cassandra/backup-health.log"
MAX_AGE_HOURS=25  # Allow 1 hour past daily schedule

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

Check if backup directory exists
if [[ ! -d "$BACKUP_DIR" ]]; then
    log "ERROR: Backup directory does not exist: $BACKUP_DIR"
    exit 1
fi

Check last backup status
if [[ -f "$BACKUP_DIR/.last_backup_status" ]]; then
    LAST_STATUS=$(cat "$BACKUP_DIR/.last_backup_status")
    if [[ "$LAST_STATUS" != "SUCCESS" ]]; then
        log "ERROR: Last backup failed with status: $LAST_STATUS"
        exit 1
    fi
else
    log "WARNING: No backup status file found"
fi

Check for recent backup files
LATEST_BACKUP=$(find "$BACKUP_DIR" -name "backup_*.tar.gz" -mtime -1 | head -1)
if [[ -z "$LATEST_BACKUP" ]]; then
    log "ERROR: No backup files found within last 24 hours"
    exit 1
fi

Check backup file age
FILE_AGE_HOURS=$(( ($(date +%s) - $(stat -c %Y "$LATEST_BACKUP")) / 3600 ))
if [[ $FILE_AGE_HOURS -gt $MAX_AGE_HOURS ]]; then
    log "ERROR: Latest backup is too old: $FILE_AGE_HOURS hours (max: $MAX_AGE_HOURS)"
    exit 1
fi

Check backup file integrity
if ! tar -tzf "$LATEST_BACKUP" > /dev/null 2>&1; then
    log "ERROR: Latest backup file is corrupted: $LATEST_BACKUP"
    exit 1
fi

Check backup size (should be > 1KB)
FILE_SIZE=$(stat -c%s "$LATEST_BACKUP")
if [[ $FILE_SIZE -lt 1024 ]]; then
    log "ERROR: Backup file is too small: $FILE_SIZE bytes"
    exit 1
fi

Check Cassandra service
if ! systemctl is-active --quiet cassandra; then
    log "ERROR: Cassandra service is not running"
    exit 1
fi

Check disk space (warn if backup partition < 10% free)
BACKUP_PARTITION=$(df "$BACKUP_DIR" | awk 'NR==2 {print $5}' | sed 's/%//')
if [[ $BACKUP_PARTITION -gt 90 ]]; then
    log "WARNING: Backup partition is ${BACKUP_PARTITION}% full"
fi

log "Backup health check passed - Latest backup: $(basename "$LATEST_BACKUP") (${FILE_AGE_HOURS}h old)"
exit 0

Set monitoring script permissions

Make the health check script executable and set correct ownership.

sudo chmod +x /opt/cassandra/scripts/check-backup-health.sh
sudo chown cassandra:cassandra /opt/cassandra/scripts/check-backup-health.sh

Configure log rotation

Set up log rotation for backup logs to prevent disk space issues.

/var/log/cassandra/backup*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    create 0644 cassandra cassandra
    postrotate
        # No need to restart services for these logs
    endscript
}

Test backup process

Run a manual backup to verify everything works correctly before relying on the automated schedule.

sudo -u cassandra /opt/cassandra/scripts/cassandra-backup.sh

Set up monitoring and alerting

Create health check timer

Set up regular health checks to monitor backup status and alert on failures.

[Unit]
Description=Cassandra Backup Health Check
After=network.target

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/check-backup-health.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-backup-health

Create health check timer

Configure the health check to run every 6 hours to catch backup issues quickly.

[Unit]
Description=Run Cassandra Backup Health Check
Requires=cassandra-backup-health.service

[Timer]
OnBootSec=30min
OnUnitActiveSec=6h
Persistent=true

[Install]
WantedBy=timers.target

Enable health monitoring

Start the health check timer to begin monitoring your backup system.

sudo systemctl daemon-reload
sudo systemctl enable cassandra-backup-health.timer
sudo systemctl start cassandra-backup-health.timer

Configure email alerts

Install and configure mail service for backup failure notifications. This uses postfix for local mail delivery.

sudo apt install -y postfix mailutils
Configure as "Local only" during setup

sudo dnf install -y postfix mailx
sudo systemctl enable --now postfix

Create alert service

Create a service that sends email alerts when backup health checks fail.

[Unit]
Description=Cassandra Backup Health Alert for %i
After=network.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo "Cassandra backup health check failed on $(hostname) at $(date). Check /var/log/cassandra/backup-health.log for details." | mail -s "ALERT: Cassandra Backup Failed on $(hostname)" root'
StandardOutput=journal
StandardError=journal

Configure systemd failure alerts

Set up the health check service to trigger email alerts on failure using systemd's OnFailure directive.

sudo mkdir -p /etc/systemd/system/cassandra-backup-health.service.d
sudo tee /etc/systemd/system/cassandra-backup-health.service.d/alerts.conf << EOF
[Unit]
OnFailure=cassandra-backup-health@%n.service
EOF

Reload and test alerting

Reload systemd configuration and test the alert system.

sudo systemctl daemon-reload
Test the health check manually
sudo systemctl start cassandra-backup-health.service

Configure retention and cleanup policies

Create cleanup script

Create an additional cleanup script for advanced retention policies based on backup age and disk usage.

#!/bin/bash

Advanced backup cleanup script
Implements tiered retention policy

set -euo pipefail

BACKUP_DIR="/var/lib/cassandra/backups"
LOG_FILE="/var/log/cassandra/cleanup.log"

Retention policy
DAILY_KEEP=7      # Keep daily backups for 7 days
WEEKLY_KEEP=4     # Keep weekly backups for 4 weeks
MONTHLY_KEEP=12   # Keep monthly backups for 12 months
DISK_THRESHOLD=85 # Clean more aggressively if disk usage > 85%

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

log "Starting backup cleanup"

Check current disk usage
DISK_USAGE=$(df "$BACKUP_DIR" | awk 'NR==2 {print $5}' | sed 's/%//')
log "Current disk usage: ${DISK_USAGE}%"

cd "$BACKUP_DIR"

Remove backups older than daily retention
log "Removing daily backups older than $DAILY_KEEP days"
find . -name "backup_*.tar.gz" -mtime +$DAILY_KEEP -delete

If disk usage is high, be more aggressive
if [[ $DISK_USAGE -gt $DISK_THRESHOLD ]]; then
    log "Disk usage high (${DISK_USAGE}%), applying aggressive cleanup"
    # Keep only last 3 days
    find . -name "backup_*.tar.gz" -mtime +3 -delete
fi

Count remaining backups
REMAINING=$(find . -name "backup_*.tar.gz" | wc -l)
log "Cleanup completed. Remaining backups: $REMAINING"

Calculate total backup space
TOTAL_SIZE=$(du -sh . | cut -f1)
log "Total backup space used: $TOTAL_SIZE"

exit 0

Set cleanup script permissions

Make the cleanup script executable and set proper ownership.

sudo chmod +x /opt/cassandra/scripts/cleanup-old-backups.sh
sudo chown cassandra:cassandra /opt/cassandra/scripts/cleanup-old-backups.sh

Create cleanup timer

Schedule the cleanup script to run weekly to maintain storage efficiency.

[Unit]
Description=Cassandra Backup Cleanup Service
After=network.target

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/cleanup-old-backups.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-cleanup

Create cleanup timer unit

Schedule cleanup to run weekly on Sunday mornings.

[Unit]
Description=Run Cassandra Backup Cleanup Weekly
Requires=cassandra-cleanup.service

[Timer]
OnCalendar=Sun 03:00
RandomizedDelaySec=3600
Persistent=true

[Install]
WantedBy=timers.target

Enable cleanup automation

Enable the cleanup timer to maintain your backup retention policy automatically.

sudo systemctl daemon-reload
sudo systemctl enable cassandra-cleanup.timer
sudo systemctl start cassandra-cleanup.timer

Verify your setup

Run these commands to verify your backup automation is working correctly.

# Check backup timer status
sudo systemctl status cassandra-backup.timer

Check health monitoring timer
sudo systemctl status cassandra-backup-health.timer

Check cleanup timer
sudo systemctl status cassandra-cleanup.timer

List recent timer executions
sudo systemctl list-timers cassandra-*

Check backup directory
ls -la /var/lib/cassandra/backups/

Run manual health check
sudo -u cassandra /opt/cassandra/scripts/check-backup-health.sh

Check Cassandra cluster status
nodetool status

View backup logs
sudo tail -f /var/log/cassandra/backup.log

Common issues

Symptom	Cause	Fix
Backup script fails with permission denied	Incorrect file ownership or permissions	`sudo chown -R cassandra:cassandra /var/lib/cassandra /opt/cassandra`
nodetool snapshot command fails	Cassandra service not running or JMX connectivity issues	Check `sudo systemctl status cassandra` and firewall rules
Backup files are empty or very small	No data in keyspaces or snapshot creation failed	Check Cassandra logs and verify keyspaces exist with `nodetool describecluster`
Health check reports backup is too old	Backup timer not running or backup script failing silently	Check timer status and backup logs for errors
Email alerts not working	Postfix not configured or root mail not set up	Configure postfix and set up root mail forwarding in `/etc/aliases`
Disk space issues with backups	Retention policy not working or cleanup timer disabled	Run cleanup manually and verify timer is enabled
Systemd timers not triggering	Timer not enabled or systemd clock issues	`sudo systemctl enable timer-name.timer` and check system time

Note: Always test backup restoration procedures regularly. A backup is only as good as your ability to restore from it. Consider setting up a separate test environment to practice restoration.

Next steps

Monitor Apache Cassandra cluster with Prometheus and Grafana dashboards for comprehensive cluster observability
Configure backup monitoring with Prometheus and Grafana for automated infrastructure oversight to integrate backup metrics
Set up centralized logging with rsyslog and logrotate for security events to aggregate Cassandra and backup logs
Configure Cassandra cross-datacenter replication for disaster recovery for enhanced data protection
Implement Cassandra backup encryption with GPG for secure storage to add encryption to your backups

Running this in production?

Ready for production scale? Setting up Cassandra backup automation is straightforward. Keeping it monitored, tuned for performance, and integrated with disaster recovery across environments is the harder part. See how we run infrastructure like this for European SaaS and fintech teams.

Automated install script

Run this to automate the entire setup

install.sh

#!/usr/bin/env bash

set -euo pipefail

# Cassandra Backup Automation Setup Script
# Auto-detects distribution and installs Cassandra backup automation

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color

# Logging functions
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }

# Cleanup function for rollback on error
cleanup() {
    log_error "Installation failed. Cleaning up..."
    systemctl stop cassandra-backup.timer 2>/dev/null || true
    systemctl disable cassandra-backup.timer 2>/dev/null || true
    rm -f /etc/systemd/system/cassandra-backup.{service,timer}
    rm -rf /opt/cassandra/scripts
    rm -rf /var/backups/cassandra
    systemctl daemon-reload
}
trap cleanup ERR

# Check if running as root or with sudo
if [[ $EUID -ne 0 ]]; then
    log_error "This script must be run as root or with sudo"
    exit 1
fi

# Auto-detect distribution
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian) 
            PKG_MGR="apt"
            PKG_UPDATE="apt update && apt upgrade -y"
            PKG_INSTALL="apt install -y"
            JAVA_PKG="openjdk-11-jdk"
            ;;
        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_UPDATE="dnf update -y"
            PKG_INSTALL="dnf install -y"
            JAVA_PKG="java-11-openjdk"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_UPDATE="yum update -y"
            PKG_INSTALL="yum install -y"
            JAVA_PKG="java-11-openjdk"
            ;;
        *)
            log_error "Unsupported distribution: $ID"
            exit 1
            ;;
    esac
else
    log_error "Cannot detect distribution. /etc/os-release not found"
    exit 1
fi

log_info "Detected distribution: $ID"

echo "[1/8] Updating system packages..."
$PKG_UPDATE

echo "[2/8] Installing Apache Cassandra..."
if [[ "$PKG_MGR" == "apt" ]]; then
    wget -q -O - https://www.apache.org/dist/cassandra/KEYS | apt-key add -
    echo "deb https://debian.cassandra.apache.org 40x main" > /etc/apt/sources.list.d/cassandra.sources.list
    apt update
    $PKG_INSTALL cassandra $JAVA_PKG
else
    $PKG_INSTALL $JAVA_PKG wget
    cat > /etc/yum.repos.d/cassandra.repo << 'EOF'
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
    $PKG_INSTALL cassandra
fi

echo "[3/8] Starting and enabling Cassandra service..."
systemctl enable cassandra
systemctl start cassandra

# Wait for Cassandra to be ready
echo "Waiting for Cassandra to start..."
for i in {1..30}; do
    if nodetool status > /dev/null 2>&1; then
        break
    fi
    sleep 5
    if [ $i -eq 30 ]; then
        log_error "Cassandra failed to start within 150 seconds"
        exit 1
    fi
done

echo "[4/8] Creating backup directories and scripts..."
mkdir -p /opt/cassandra/scripts
mkdir -p /var/backups/cassandra
mkdir -p /var/log/cassandra

cat > /opt/cassandra/scripts/cassandra-backup.sh << 'EOF'
#!/usr/bin/env bash

set -euo pipefail

SNAPSHOT_NAME="backup_$(date +%Y%m%d_%H%M%S)"
BACKUP_DIR="/var/backups/cassandra"
RETENTION_DAYS=${RETENTION_DAYS:-7}
LOG_FILE="/var/log/cassandra/backup.log"

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

log "Starting Cassandra backup process"

if ! nodetool status > /dev/null 2>&1; then
    log "ERROR: Cassandra is not running or nodetool failed"
    exit 1
fi

log "Creating snapshot: $SNAPSHOT_NAME"
if nodetool snapshot -t "$SNAPSHOT_NAME"; then
    log "Snapshot created successfully: $SNAPSHOT_NAME"
else
    log "ERROR: Failed to create snapshot"
    exit 1
fi

log "Copying snapshots to backup directory"
BACKUP_PATH="$BACKUP_DIR/$SNAPSHOT_NAME"
mkdir -p "$BACKUP_PATH"

find /var/lib/cassandra/data -name "$SNAPSHOT_NAME" -type d | while read snapshot_dir; do
    relative_path=$(echo "$snapshot_dir" | sed 's|/var/lib/cassandra/data/||')
    target_dir="$BACKUP_PATH/$relative_path"
    
    mkdir -p "$(dirname "$target_dir")"
    cp -r "$snapshot_dir" "$(dirname "$target_dir")/"
    log "Copied: $relative_path"
done

cat > "$BACKUP_PATH/backup_info.txt" << BACKUP_INFO
Backup Name: $SNAPSHOT_NAME
Backup Date: $(date)
Cassandra Version: $(nodetool version)
Keyspaces: $(nodetool describecluster | grep -A 1 "Schema versions")
BACKUP_INFO

log "Cleaning up old snapshots"
nodetool clearsnapshot

log "Removing backups older than $RETENTION_DAYS days"
find "$BACKUP_DIR" -maxdepth 1 -type d -name "backup_*" -mtime +$RETENTION_DAYS -exec rm -rf {} \;

if find "$BACKUP_PATH" -name "*.db" | head -1 > /dev/null 2>&1; then
    log "Backup integrity verified"
    echo "SUCCESS" > "$BACKUP_DIR/.last_backup_status"
else
    log "ERROR: Backup integrity check failed"
    echo "FAILED" > "$BACKUP_DIR/.last_backup_status"
    exit 1
fi

log "Backup completed successfully: $SNAPSHOT_NAME"
exit 0
EOF

chmod 755 /opt/cassandra/scripts/cassandra-backup.sh
chown cassandra:cassandra /opt/cassandra/scripts/cassandra-backup.sh
chown -R cassandra:cassandra /var/backups/cassandra

echo "[5/8] Creating systemd service unit..."
cat > /etc/systemd/system/cassandra-backup.service << 'EOF'
[Unit]
Description=Cassandra Backup Service
Requires=cassandra.service
After=cassandra.service

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/cassandra-backup.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-backup

# Security settings
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/cassandra /var/log/cassandra /var/backups/cassandra

# Resource limits
MemoryLimit=512M
TimeoutSec=3600
EOF

echo "[6/8] Creating systemd timer unit..."
cat > /etc/systemd/system/cassandra-backup.timer << 'EOF'
[Unit]
Description=Run Cassandra Backup Daily
Requires=cassandra-backup.service

[Timer]
# Run daily at 2 AM with 30-minute random delay
OnCalendar=02:00
RandomizedDelaySec=1800
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "[7/8] Enabling backup automation..."
systemctl daemon-reload
systemctl enable cassandra-backup.timer
systemctl start cassandra-backup.timer

echo "[8/8] Creating monitoring script..."
cat > /opt/cassandra/scripts/check-backup-health.sh << 'EOF'
#!/usr/bin/env bash

BACKUP_DIR="/var/backups/cassandra"
STATUS_FILE="$BACKUP_DIR/.last_backup_status"
MAX_AGE_HOURS=25

if [ ! -f "$STATUS_FILE" ]; then
    echo "CRITICAL: No backup status file found"
    exit 2
fi

STATUS=$(cat "$STATUS_FILE")
LATEST_BACKUP=$(find "$BACKUP_DIR" -maxdepth 1 -type d -name "backup_*" | sort | tail -1)

if [ -z "$LATEST_BACKUP" ]; then
    echo "CRITICAL: No backup directories found"
    exit 2
fi

BACKUP_AGE=$(find "$LATEST_BACKUP" -maxdepth 0 -mmin +$((MAX_AGE_HOURS * 60)) | wc -l)

if [ "$STATUS" = "SUCCESS" ] && [ "$BACKUP_AGE" -eq 0 ]; then
    echo "OK: Latest backup successful and recent"
    exit 0
elif [ "$STATUS" = "FAILED" ]; then
    echo "CRITICAL: Last backup failed"
    exit 2
elif [ "$BACKUP_AGE" -gt 0 ]; then
    echo "WARNING: Latest backup is older than $MAX_AGE_HOURS hours"
    exit 1
else
    echo "WARNING: Unknown backup status"
    exit 1
fi
EOF

chmod 755 /opt/cassandra/scripts/check-backup-health.sh
chown cassandra:cassandra /opt/cassandra/scripts/check-backup-health.sh

# Verify installation
log_info "Verifying installation..."
if systemctl is-active --quiet cassandra; then
    log_info "✓ Cassandra service is running"
else
    log_error "✗ Cassandra service is not running"
    exit 1
fi

if systemctl is-enabled --quiet cassandra-backup.timer; then
    log_info "✓ Backup timer is enabled"
else
    log_error "✗ Backup timer is not enabled"
    exit 1
fi

if systemctl is-active --quiet cassandra-backup.timer; then
    log_info "✓ Backup timer is running"
else
    log_error "✗ Backup timer is not running"
    exit 1
fi

log_info "Cassandra backup automation setup completed successfully!"
log_info "Backup timer status: $(systemctl list-timers cassandra-backup.timer --no-pager)"
log_info "Test backup script: sudo -u cassandra /opt/cassandra/scripts/cassandra-backup.sh"
log_info "Check backup health: /opt/cassandra/scripts/check-backup-health.sh"
log_info "View backup logs: journalctl -u cassandra-backup.service"

Review the script before running. Execute with: bash install.sh

#cassandra #backup #nodetool #systemd #automation

Set up Cassandra backup automation with nodetool

Prerequisites

What this solves

Step-by-step installation

Update system packages

Install Apache Cassandra

Configure Cassandra data directories

Enable Cassandra service

Create backup script

Cassandra backup script with nodetool

Usage: ./cassandra-backup.sh

Configuration

Logging function

Create log directory

Check if Cassandra is running

Create snapshot for all keyspaces

Copy snapshots to backup directory

Find and copy all snapshot files

Create backup metadata

Compress backup

Calculate backup size

Clean old snapshots from Cassandra

Clean old backup files

Verify backup integrity

Set script permissions

Create systemd service unit

Security settings

Resource limits

Create systemd timer unit

Run daily at 2 AM with 30-minute random delay

Enable backup automation

Create monitoring script

Cassandra backup health check script

Returns 0 if backup is healthy, 1 if issues found

Check if backup directory exists

Check last backup status

Check for recent backup files

Check backup file age

Check backup file integrity

Check backup size (should be > 1KB)

Check Cassandra service

Check disk space (warn if backup partition < 10% free)

Set monitoring script permissions

Configure log rotation

Test backup process

Set up monitoring and alerting

Create health check timer

Create health check timer

Enable health monitoring

Configure email alerts

Configure as "Local only" during setup

Create alert service

Configure systemd failure alerts

Reload and test alerting

Test the health check manually

Configure retention and cleanup policies

Create cleanup script

Advanced backup cleanup script

Implements tiered retention policy

Retention policy

Check current disk usage

Remove backups older than daily retention

If disk usage is high, be more aggressive

Count remaining backups

Calculate total backup space

Set cleanup script permissions

Create cleanup timer

Create cleanup timer unit

Enable cleanup automation

Verify your setup

Check health monitoring timer

Check cleanup timer

List recent timer executions

Check backup directory

Run manual health check

Check Cassandra cluster status

View backup logs

Common issues

Next steps

Running this in production?

`Configure as "Local only" during setup`