Set up Cassandra backup automation with nodetool

Intermediate 45 min May 01, 2026
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Automate Apache Cassandra backups using nodetool snapshots, systemd timers, and retention policies. Configure monitoring and alerting for production-grade backup management with automated cleanup and verification.

Prerequisites

  • Root or sudo access
  • At least 4GB RAM
  • 50GB+ disk space for backups
  • Basic familiarity with systemd timers

What this solves

Manual Cassandra backups are error-prone and often forgotten until disaster strikes. This tutorial sets up automated snapshot backups using nodetool, systemd timers for scheduling, and retention policies for storage management. You'll also configure monitoring to track backup success and get alerts when backups fail.

Step-by-step installation

Update system packages

Start by updating your package manager to ensure you get the latest versions of dependencies.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y

# AlmaLinux / Rocky Linux
sudo dnf update -y

Install Apache Cassandra

Install Cassandra and Java runtime if not already present. This includes the nodetool command required for backup operations.

# Ubuntu / Debian (apt-key is deprecated, so the key goes into a dedicated keyring file)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo tee /etc/apt/keyrings/apache-cassandra.asc > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/apache-cassandra.asc] https://debian.cassandra.apache.org 40x main" | sudo tee /etc/apt/sources.list.d/cassandra.sources.list
sudo apt update
sudo apt install -y cassandra openjdk-11-jdk

# AlmaLinux / Rocky Linux
sudo dnf install -y java-11-openjdk
sudo tee /etc/yum.repos.d/cassandra.repo << EOF
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/40x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
sudo dnf install -y cassandra

Configure Cassandra data directories

Set up proper directories for Cassandra data and backups with correct permissions. The cassandra user needs write access to backup locations.

sudo mkdir -p /var/lib/cassandra/backups
sudo mkdir -p /opt/cassandra/scripts
sudo chown -R cassandra:cassandra /var/lib/cassandra
sudo chmod 755 /var/lib/cassandra/backups
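
The prerequisites call for 50GB+ of disk space for backups, so it can help to check the free space on the backup location before going further. The following is a minimal sketch, not part of the original setup; the function names free_gb and check_backup_space are illustrative, and it assumes a df that supports the POSIX -P and -k options.

```shell
#!/bin/bash
# Print the free space, in whole GiB, on the filesystem that holds the given path.
free_gb() {
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Warn when the location has less free space than required (default: 50 GiB).
check_backup_space() {
    local path=$1 required_gb=${2:-50} available
    available=$(free_gb "$path")
    if (( available < required_gb )); then
        echo "WARNING: only ${available} GiB free on $path (want ${required_gb} GiB)"
        return 1
    fi
    echo "OK: ${available} GiB free on $path"
}

# Hypothetical usage:
# check_backup_space /var/lib/cassandra/backups 50
```

The check looks at the filesystem that contains the path, so it also works when the backup directory is a separate mount.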

Enable Cassandra service

Start Cassandra and enable it to run on boot. Verify the service is running correctly before proceeding.

sudo systemctl enable --now cassandra
sudo systemctl status cassandra
nodetool status
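
Cassandra can take a while to start, and nodetool status fails until JMX is available. In its output, a healthy node appears on a line that begins with UN (Up/Normal). The sketch below counts those lines so a script can wait for the node to come up; the function names and the retry values are assumptions chosen for illustration.

```shell
#!/bin/bash
# Count nodes reported as UN (Up/Normal) in `nodetool status` output read from stdin.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Poll until at least one node is Up/Normal.
# Defaults (24 attempts, 5 seconds apart) are illustrative, not tuned values.
wait_for_cassandra() {
    local attempts=${1:-24} delay=${2:-5} i
    for ((i = 0; i < attempts; i++)); do
        if [[ "$(nodetool status 2>/dev/null | count_up_normal)" -ge 1 ]]; then
            return 0
        fi
        sleep "$delay"
    done
    return 1
}

# Hypothetical usage:
# wait_for_cassandra && echo "Cassandra is up"
```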

Create backup script

Create a backup script that handles snapshots, cleanup, and logging, and save it as /opt/cassandra/scripts/cassandra-backup.sh. The script snapshots every keyspace on the local node, copies the snapshot files into a compressed archive, and removes archives older than the retention period.

#!/bin/bash
# Cassandra backup script with nodetool
# Usage: ./cassandra-backup.sh

set -euo pipefail

# Configuration
BACKUP_DIR="/var/lib/cassandra/backups"
LOG_FILE="/var/log/cassandra/backup.log"
RETENTION_DAYS=7
TIMESTAMP=$(date +"%Y%m%d_%H%M%S")
SNAPSHOT_NAME="backup_${TIMESTAMP}"

# Logging function
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Create log directory. No sudo here: the script runs as the cassandra user,
# and the systemd unit sets NoNewPrivileges=true, which would make sudo fail.
mkdir -p "$(dirname "$LOG_FILE")"

log "Starting Cassandra backup: $SNAPSHOT_NAME"

# Check if Cassandra is running
if ! nodetool status > /dev/null 2>&1; then
    log "ERROR: Cassandra is not running or nodetool failed"
    exit 1
fi

# Create snapshot for all keyspaces
log "Creating snapshot: $SNAPSHOT_NAME"
if nodetool snapshot -t "$SNAPSHOT_NAME"; then
    log "Snapshot created successfully: $SNAPSHOT_NAME"
else
    log "ERROR: Failed to create snapshot"
    exit 1
fi

# Copy snapshots to backup directory
log "Copying snapshots to backup directory"
BACKUP_PATH="$BACKUP_DIR/$SNAPSHOT_NAME"
mkdir -p "$BACKUP_PATH"

# Find and copy all snapshot files
find /var/lib/cassandra/data -name "$SNAPSHOT_NAME" -type d | while read -r snapshot_dir; do
    # Extract keyspace and table info from path
    relative_path=$(echo "$snapshot_dir" | sed 's|/var/lib/cassandra/data/||')
    target_dir="$BACKUP_PATH/$relative_path"
    mkdir -p "$(dirname "$target_dir")"
    cp -r "$snapshot_dir" "$(dirname "$target_dir")/"
    log "Copied: $relative_path"
done

# Create backup metadata
cat > "$BACKUP_PATH/backup_info.txt" << EOF
Backup Name: $SNAPSHOT_NAME
Backup Date: $(date)
Cassandra Version: $(nodetool version)
Node Status: $(nodetool status)
Schema Versions: $(nodetool describecluster | grep -A 10 "Schema versions")
EOF

# Compress backup
log "Compressing backup"
cd "$BACKUP_DIR"
tar -czf "${SNAPSHOT_NAME}.tar.gz" "$SNAPSHOT_NAME"/
rm -rf "$SNAPSHOT_NAME"

# Calculate backup size
BACKUP_SIZE=$(du -h "${SNAPSHOT_NAME}.tar.gz" | cut -f1)
log "Backup compressed: ${SNAPSHOT_NAME}.tar.gz ($BACKUP_SIZE)"

# Clean old snapshots from Cassandra.
# Cassandra 4.x rejects a bare "clearsnapshot"; it needs --all or -t <name>.
log "Cleaning old snapshots from Cassandra data directory"
nodetool clearsnapshot --all

# Clean old backup files
log "Cleaning backups older than $RETENTION_DAYS days"
find "$BACKUP_DIR" -name "backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete

# Verify backup integrity before reporting success
if tar -tzf "${SNAPSHOT_NAME}.tar.gz" > /dev/null 2>&1; then
    log "Backup integrity verified"
    echo "SUCCESS" > "$BACKUP_DIR/.last_backup_status"
    log "Backup completed successfully: ${SNAPSHOT_NAME}.tar.gz"
else
    log "ERROR: Backup integrity check failed"
    echo "FAILED" > "$BACKUP_DIR/.last_backup_status"
    exit 1
fi

exit 0
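
The copy loop in the backup script keeps the keyspace and table layout by stripping the data directory prefix from each snapshot path and re-creating the remainder under the backup path. The sketch below isolates that mapping so it can be inspected on its own. The function name snapshot_target, the keyspace shop, and the table directory orders-1a2b are hypothetical, and it uses bash prefix removal instead of sed, which is an equivalent alternative rather than what the script does.

```shell
#!/bin/bash
# Data root used by the backup script.
DATA_ROOT="/var/lib/cassandra/data"

# Map a snapshot directory under DATA_ROOT to its destination in the backup tree.
snapshot_target() {
    local snapshot_dir=$1 backup_path=$2
    # Strip the data root prefix, keeping keyspace/table/snapshots/<name>.
    local relative_path=${snapshot_dir#"$DATA_ROOT"/}
    printf '%s/%s\n' "$backup_path" "$relative_path"
}

# Example with a hypothetical keyspace and table directory:
snapshot_target \
    "/var/lib/cassandra/data/shop/orders-1a2b/snapshots/backup_20260501_020000" \
    "/var/lib/cassandra/backups/backup_20260501_020000"
```

Keeping the keyspace and table directories in the archive matters at restore time, because SSTable files have to go back into the directory of the table they came from.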

Set script permissions

Make the backup script executable and ensure proper ownership for the cassandra user.

sudo chmod +x /opt/cassandra/scripts/cassandra-backup.sh
sudo chown cassandra:cassandra /opt/cassandra/scripts/cassandra-backup.sh

Create systemd service unit

Create a systemd service to run the backup script with proper user context and logging. Save it as /etc/systemd/system/cassandra-backup.service.

[Unit]
Description=Cassandra Backup Service
Requires=cassandra.service
After=cassandra.service

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/cassandra-backup.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-backup

# Security settings
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/cassandra /var/log/cassandra

# Resource limits (MemoryMax replaces the deprecated MemoryLimit)
MemoryMax=512M
TimeoutSec=3600

Create systemd timer unit

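Configure a systemd timer to run backups automatically and save it as /etc/systemd/system/cassandra-backup.timer. This schedules daily backups at 2 AM with a randomized delay of up to 30 minutes to avoid resource conflicts. A timer activates the service with the same name by default, so the unit needs no Requires= line; adding one would start a backup every time the timer itself is started, including at boot.

[Unit]
Description=Run Cassandra Backup Daily

[Timer]
# Run daily at 2 AM with 30-minute random delay
OnCalendar=02:00
RandomizedDelaySec=1800
Persistent=true

[Install]
WantedBy=timers.target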

Enable backup automation

Reload systemd configuration and enable the backup timer to start on boot.

sudo systemctl daemon-reload
sudo systemctl enable cassandra-backup.timer
sudo systemctl start cassandra-backup.timer

Create monitoring script

Create a monitoring script that checks backup status and exits non-zero when a backup is missing, stale, or corrupted. Save it as /opt/cassandra/scripts/check-backup-health.sh.

#!/bin/bash
# Cassandra backup health check script
# Returns 0 if backup is healthy, 1 if issues found

set -euo pipefail

BACKUP_DIR="/var/lib/cassandra/backups"
LOG_FILE="/var/log/cassandra/backup-health.log"
MAX_AGE_HOURS=25  # Allow 1 hour past daily schedule

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

# Check if backup directory exists
if [[ ! -d "$BACKUP_DIR" ]]; then
    log "ERROR: Backup directory does not exist: $BACKUP_DIR"
    exit 1
fi

# Check last backup status
if [[ -f "$BACKUP_DIR/.last_backup_status" ]]; then
    LAST_STATUS=$(cat "$BACKUP_DIR/.last_backup_status")
    if [[ "$LAST_STATUS" != "SUCCESS" ]]; then
        log "ERROR: Last backup failed with status: $LAST_STATUS"
        exit 1
    fi
else
    log "WARNING: No backup status file found"
fi

# Find the most recent backup file (sorted by modification time, newest last)
LATEST_BACKUP=$(find "$BACKUP_DIR" -name "backup_*.tar.gz" -printf '%T@ %p\n' | sort -n | tail -n 1 | cut -d' ' -f2-)
if [[ -z "$LATEST_BACKUP" ]]; then
    log "ERROR: No backup files found in $BACKUP_DIR"
    exit 1
fi

# Check backup file age
FILE_AGE_HOURS=$(( ($(date +%s) - $(stat -c %Y "$LATEST_BACKUP")) / 3600 ))
if [[ $FILE_AGE_HOURS -gt $MAX_AGE_HOURS ]]; then
    log "ERROR: Latest backup is too old: $FILE_AGE_HOURS hours (max: $MAX_AGE_HOURS)"
    exit 1
fi

# Check backup file integrity
if ! tar -tzf "$LATEST_BACKUP" > /dev/null 2>&1; then
    log "ERROR: Latest backup file is corrupted: $LATEST_BACKUP"
    exit 1
fi

# Check backup size (should be > 1KB)
FILE_SIZE=$(stat -c%s "$LATEST_BACKUP")
if [[ $FILE_SIZE -lt 1024 ]]; then
    log "ERROR: Backup file is too small: $FILE_SIZE bytes"
    exit 1
fi

# Check Cassandra service
if ! systemctl is-active --quiet cassandra; then
    log "ERROR: Cassandra service is not running"
    exit 1
fi

# Check disk space (warn if backup partition < 10% free)
BACKUP_PARTITION=$(df -P "$BACKUP_DIR" | awk 'NR==2 {print $5}' | sed 's/%//')
if [[ $BACKUP_PARTITION -gt 90 ]]; then
    log "WARNING: Backup partition is ${BACKUP_PARTITION}% full"
fi

log "Backup health check passed - Latest backup: $(basename "$LATEST_BACKUP") (${FILE_AGE_HOURS}h old)"
exit 0
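
The age check in the health script subtracts the file's modification time from the current time and divides by 3600, using integer division. The sketch below pulls that arithmetic into two small functions so the threshold can be tried against any file; the names file_age_hours and is_fresh are illustrative, and it assumes GNU stat as the health script does.

```shell
#!/bin/bash
# Print the age of a file in whole hours, based on its modification time.
file_age_hours() {
    local file=$1 now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file")
    echo $(( (now - mtime) / 3600 ))
}

# Succeed when the file is no older than max_hours, fail otherwise.
is_fresh() {
    local file=$1 max_hours=$2
    [[ $(file_age_hours "$file") -le $max_hours ]]
}

# Hypothetical usage:
# is_fresh /var/lib/cassandra/backups/backup_20260501_020000.tar.gz 25 || echo "stale"
```

Because the division truncates, a backup that is 25 hours and 59 minutes old still counts as 25 hours and passes a 25-hour limit.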

Set monitoring script permissions

Make the health check script executable and set correct ownership.

sudo chmod +x /opt/cassandra/scripts/check-backup-health.sh
sudo chown cassandra:cassandra /opt/cassandra/scripts/check-backup-health.sh

Configure log rotation

Set up log rotation for backup logs to prevent disk space issues.

/var/log/cassandra/backup*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    create 0644 cassandra cassandra
    postrotate
        # No need to restart services for these logs
    endscript
}

Test backup process

Run a manual backup to verify everything works correctly before relying on the automated schedule.

sudo -u cassandra /opt/cassandra/scripts/cassandra-backup.sh

Set up monitoring and alerting

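Create health check service

Define a oneshot service that runs the health check script, so failures are recorded in the journal and can trigger alerts. Save it as /etc/systemd/system/cassandra-backup-health.service.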

[Unit]
Description=Cassandra Backup Health Check
After=network.target

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/check-backup-health.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-backup-health

Create health check timer

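Configure the health check to run every 6 hours to catch backup issues quickly. Save this as /etc/systemd/system/cassandra-backup-health.timer. The timer activates the service with the same name by default, so no Requires= line is needed, and Persistent= is omitted because it only applies to OnCalendar= schedules.

[Unit]
Description=Run Cassandra Backup Health Check

[Timer]
OnBootSec=30min
OnUnitActiveSec=6h

[Install]
WantedBy=timers.target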

Enable health monitoring

Start the health check timer to begin monitoring your backup system.

sudo systemctl daemon-reload
sudo systemctl enable cassandra-backup-health.timer
sudo systemctl start cassandra-backup-health.timer

Configure email alerts

Install and configure mail service for backup failure notifications. This uses postfix for local mail delivery.

# Ubuntu / Debian (configure as "Local only" during setup)
sudo apt install -y postfix mailutils

# AlmaLinux / Rocky Linux
sudo dnf install -y postfix mailx
sudo systemctl enable --now postfix

Create alert service

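Create a template service that sends an email alert when the backup health check fails. The OnFailure setting in the next step refers to it as cassandra-backup-health@%n.service, so the unit file must be the matching template, /etc/systemd/system/cassandra-backup-health@.service.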

[Unit]
Description=Cassandra Backup Health Alert for %i
After=network.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c 'echo "Cassandra backup health check failed on $(hostname) at $(date). Check /var/log/cassandra/backup-health.log for details." | mail -s "ALERT: Cassandra Backup Failed on $(hostname)" root'
StandardOutput=journal
StandardError=journal

Configure systemd failure alerts

Set up the health check service to trigger email alerts on failure using systemd's OnFailure directive.

sudo mkdir -p /etc/systemd/system/cassandra-backup-health.service.d
sudo tee /etc/systemd/system/cassandra-backup-health.service.d/alerts.conf << EOF
[Unit]
OnFailure=cassandra-backup-health@%n.service
EOF

Reload and test alerting

Reload systemd configuration and test the alert system.

sudo systemctl daemon-reload

# Test the health check manually

sudo systemctl start cassandra-backup-health.service
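
A passing health check does not exercise the alert path. One way to test it is to force a failure: the health check script exits with an error whenever .last_backup_status contains anything other than SUCCESS. The sketch below temporarily writes FAILED to that file, runs a command, and then restores the previous content. The helper name with_failed_status is hypothetical, and the approach is an illustration rather than part of the original setup.

```shell
#!/bin/bash
# Temporarily mark the last backup as FAILED, run a command, then restore the status file.
with_failed_status() {
    local status_file=$1
    shift
    local saved="" rc=0
    if [[ -f "$status_file" ]]; then
        saved=$(cat "$status_file")
    fi
    echo "FAILED" > "$status_file"
    # Run the given command and remember its exit code.
    "$@" || rc=$?
    if [[ -n "$saved" ]]; then
        echo "$saved" > "$status_file"
    else
        rm -f "$status_file"
    fi
    return "$rc"
}

# Hypothetical usage, run as root so the unit can be started:
# with_failed_status /var/lib/cassandra/backups/.last_backup_status \
#     systemctl start cassandra-backup-health.service
```

If alerting is wired up correctly, the service should fail, the OnFailure unit should run, and a message should arrive in root's mailbox.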

Configure retention and cleanup policies

Create cleanup script

Create an additional cleanup script that prunes backups by age and prunes more aggressively when disk usage is high. Save it as /opt/cassandra/scripts/cleanup-old-backups.sh.

#!/bin/bash
# Advanced backup cleanup script
# Removes backups by age, with a stricter limit when the disk is nearly full

set -euo pipefail

BACKUP_DIR="/var/lib/cassandra/backups"
LOG_FILE="/var/log/cassandra/cleanup.log"

# Retention policy
DAILY_KEEP=7        # Keep daily backups for 7 days
WEEKLY_KEEP=4       # Keep weekly backups for 4 weeks (defined, but not applied below)
MONTHLY_KEEP=12     # Keep monthly backups for 12 months (defined, but not applied below)
DISK_THRESHOLD=85   # Clean more aggressively if disk usage > 85%

log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

log "Starting backup cleanup"

# Check current disk usage
DISK_USAGE=$(df -P "$BACKUP_DIR" | awk 'NR==2 {print $5}' | sed 's/%//')
log "Current disk usage: ${DISK_USAGE}%"

cd "$BACKUP_DIR"

# Remove backups older than daily retention
log "Removing daily backups older than $DAILY_KEEP days"
find . -name "backup_*.tar.gz" -mtime +$DAILY_KEEP -delete

# If disk usage is high, be more aggressive
if [[ $DISK_USAGE -gt $DISK_THRESHOLD ]]; then
    log "Disk usage high (${DISK_USAGE}%), applying aggressive cleanup"
    # Keep only last 3 days
    find . -name "backup_*.tar.gz" -mtime +3 -delete
fi

# Count remaining backups
REMAINING=$(find . -name "backup_*.tar.gz" | wc -l)
log "Cleanup completed. Remaining backups: $REMAINING"

# Calculate total backup space
TOTAL_SIZE=$(du -sh . | cut -f1)
log "Total backup space used: $TOTAL_SIZE"

exit 0
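
Before trusting a retention rule that ends in -delete, it can be worth listing what it would match. The sketch below is a dry run that uses the same find test with -print instead of -delete; the function name list_expired is illustrative. Note that find rounds file age down to whole days, so -mtime +7 only matches files that are at least eight full days old.

```shell
#!/bin/bash
# List (without deleting) the backup archives that `find -mtime +DAYS` would match.
list_expired() {
    local dir=$1 days=$2
    find "$dir" -maxdepth 1 -name 'backup_*.tar.gz' -mtime +"$days" -print | sort
}

# Hypothetical usage:
# list_expired /var/lib/cassandra/backups 7
```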

Set cleanup script permissions

Make the cleanup script executable and set proper ownership.

sudo chmod +x /opt/cassandra/scripts/cleanup-old-backups.sh
sudo chown cassandra:cassandra /opt/cassandra/scripts/cleanup-old-backups.sh

Create cleanup service

Define a oneshot service that runs the cleanup script. Save it as /etc/systemd/system/cassandra-cleanup.service.

[Unit]
Description=Cassandra Backup Cleanup Service
After=network.target

[Service]
Type=oneshot
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/scripts/cleanup-old-backups.sh
StandardOutput=journal
StandardError=journal
SyslogIdentifier=cassandra-cleanup

Create cleanup timer unit

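Schedule cleanup to run weekly on Sunday mornings. Save this as /etc/systemd/system/cassandra-cleanup.timer. The timer activates the service with the same name by default, so no Requires= line is needed.

[Unit]
Description=Run Cassandra Backup Cleanup Weekly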

[Timer]
OnCalendar=Sun 03:00
RandomizedDelaySec=3600
Persistent=true

[Install]
WantedBy=timers.target

Enable cleanup automation

Enable the cleanup timer to maintain your backup retention policy automatically.

sudo systemctl daemon-reload
sudo systemctl enable cassandra-cleanup.timer
sudo systemctl start cassandra-cleanup.timer

Verify your setup

Run these commands to verify your backup automation is working correctly.

# Check backup timer status
sudo systemctl status cassandra-backup.timer

# Check health monitoring timer
sudo systemctl status cassandra-backup-health.timer

# Check cleanup timer
sudo systemctl status cassandra-cleanup.timer

# List recent timer executions (quoted so the shell does not expand the pattern)
sudo systemctl list-timers 'cassandra-*'

# Check backup directory
ls -la /var/lib/cassandra/backups/

# Run manual health check
sudo -u cassandra /opt/cassandra/scripts/check-backup-health.sh

# Check Cassandra cluster status
nodetool status

# View backup logs
sudo tail -f /var/log/cassandra/backup.log

Common issues

Symptom | Cause | Fix
Backup script fails with permission denied | Incorrect file ownership or permissions | sudo chown -R cassandra:cassandra /var/lib/cassandra /opt/cassandra
nodetool snapshot command fails | Cassandra service not running or JMX connectivity issues | Check sudo systemctl status cassandra and firewall rules
Backup files are empty or very small | No data in keyspaces or snapshot creation failed | Check Cassandra logs and verify keyspaces exist with nodetool describecluster
Health check reports backup is too old | Backup timer not running or backup script failing silently | Check timer status and backup logs for errors
Email alerts not working | Postfix not configured or root mail not set up | Configure postfix and set up root mail forwarding in /etc/aliases
Disk space issues with backups | Retention policy not working or cleanup timer disabled | Run cleanup manually and verify timer is enabled
Systemd timers not triggering | Timer not enabled or systemd clock issues | sudo systemctl enable timer-name.timer and check system time

Note: Always test backup restoration procedures regularly. A backup is only as good as your ability to restore from it. Consider setting up a separate test environment to practice restoration.
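
As a first step toward restore testing, an archive can be unpacked into a scratch directory to confirm that it extracts and contains files. The sketch below does only that. It does not load data into Cassandra, so it complements a full restore rehearsal rather than replacing it. The function name trial_extract is illustrative.

```shell
#!/bin/bash
# Extract a backup archive into a temporary directory and print how many files it holds.
# Returns non-zero if the archive cannot be extracted.
trial_extract() {
    local archive=$1 scratch count
    scratch=$(mktemp -d)
    if ! tar -xzf "$archive" -C "$scratch"; then
        rm -rf "$scratch"
        return 1
    fi
    count=$(find "$scratch" -type f | wc -l)
    rm -rf "$scratch"
    # Arithmetic expansion strips any padding some wc implementations add.
    echo "$((count))"
}

# Hypothetical usage:
# trial_extract /var/lib/cassandra/backups/backup_20260501_020000.tar.gz
```

The scratch directory needs enough free space for the uncompressed backup, which can be several times the size of the archive.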
