Linux Disk Usage Monitoring & Cleanup with systemd

Set up automated disk monitoring, log cleanup, and email alerts using systemd timers to prevent disk space issues. Configure log rotation, temporary file cleanup, and threshold-based alerting for production systems.

Prerequisites

Root or sudo access
Basic familiarity with systemd
Email server configuration knowledge

What this solves

Running out of disk space can cause system failures, service outages, and data loss in production environments. This tutorial sets up automated disk monitoring with email alerts when thresholds are reached, configures systemd timers for regular cleanup tasks, and implements log rotation to prevent uncontrolled disk usage growth.

Step-by-step configuration

Update system packages

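Start by updating the package index and installed packages so you have the latest system tools. Run the first command on Debian or Ubuntu and the second on RHEL-family systems (AlmaLinux, Rocky Linux, Fedora).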

sudo apt update && sudo apt upgrade -y

sudo dnf update -y

Install monitoring and mail utilities

Install the mail client, Postfix, logrotate, and the ncdu and tree utilities. The first command is for Debian or Ubuntu, the second for RHEL-family systems.

sudo apt install -y mailutils postfix logrotate ncdu tree

sudo dnf install -y mailx postfix logrotate ncdu tree

Create disk monitoring script

Create a script that checks disk usage and sends email alerts when thresholds are exceeded.

sudo mkdir -p /opt/disk-monitor
sudo tee /opt/disk-monitor/disk-check.sh > /dev/null << 'EOF'
#!/bin/bash

# Configuration
THRESHOLD_WARNING=80
THRESHOLD_CRITICAL=90
EMAIL_RECIPIENT="admin@example.com"
HOSTNAME=$(hostname -f)
LOG_FILE="/var/log/disk-monitor.log"

# Function to log messages
log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

# Function to send email alert
send_alert() {
    local severity=$1
    local filesystem=$2
    local usage=$3
    local available=$4
    
    local subject="[$severity] Disk Space Alert - $HOSTNAME"
    local body="Disk space alert for $HOSTNAME:

Filesystem: $filesystem
Usage: $usage%
Available: $available
Threshold: ${severity,,} at ${THRESHOLD_WARNING}%/${THRESHOLD_CRITICAL}%

Please investigate and free up disk space immediately."
    
    echo -e "$body" | mail -s "$subject" "$EMAIL_RECIPIENT"
    log_message "$severity alert sent for $filesystem ($usage% used)"
}

# Check disk usage for all mounted filesystems
# -P keeps each filesystem on one line even when the device name is long
df -hP | awk 'NR>1 && !/tmpfs|devtmpfs|udev/ {print $5 " " $6 " " $4}' | while read -r usage filesystem available; do
    usage=${usage%\%}

    if [ "$usage" -ge "$THRESHOLD_CRITICAL" ]; then
        send_alert "CRITICAL" "$filesystem" "$usage" "$available"
    elif [ "$usage" -ge "$THRESHOLD_WARNING" ]; then
        send_alert "WARNING" "$filesystem" "$usage" "$available"
    fi
done

# Log successful completion
log_message "Disk check completed successfully"
EOF
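
The threshold comparison in the script can be exercised without touching real disks. The following is a minimal sketch, assuming two hypothetical helpers (classify_usage and parse_df_line, neither of which exists in the script above) and a made-up line of df -P output:

```shell
#!/usr/bin/env bash
# classify_usage USAGE WARNING CRITICAL (hypothetical helper)
# Prints CRITICAL, WARNING, or OK for an integer usage percentage.
classify_usage() {
    local usage=$1 warning=$2 critical=$3
    if [ "$usage" -ge "$critical" ]; then
        echo "CRITICAL"
    elif [ "$usage" -ge "$warning" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# parse_df_line LINE (hypothetical helper)
# Prints "usage mountpoint available" from one data line of `df -P` output.
parse_df_line() {
    awk '{ sub(/%/, "", $5); print $5, $6, $4 }' <<< "$1"
}

# Invented sample line; a real run would read lines from `df -P` instead.
sample="/dev/sda1 41152736 35000000 4040000 90% /"
read -r usage mount avail <<< "$(parse_df_line "$sample")"
echo "$mount is $(classify_usage "$usage" 80 90) at ${usage}%"
# prints: / is CRITICAL at 90%
```

Keeping the comparison in a function makes the boundary behaviour easy to check: a value equal to a threshold counts as having reached it, which matches the -ge tests in the monitoring script.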

Make the monitoring script executable

Set proper permissions on the disk monitoring script to allow execution by the system.

sudo chmod 755 /opt/disk-monitor/disk-check.sh
sudo chown root:root /opt/disk-monitor/disk-check.sh

Create cleanup script for temporary files

Create a script to automatically clean up temporary files, old logs, and cache directories.

sudo tee /opt/disk-monitor/cleanup.sh > /dev/null << 'EOF'
#!/bin/bash

LOG_FILE="/var/log/disk-cleanup.log"
CLEANED_SPACE=0

# Function to log messages with space saved
log_cleanup() {
    local action=$1
    local space_before=$2
    local space_after=$3
    local saved=$((space_before - space_after))
    CLEANED_SPACE=$((CLEANED_SPACE + saved))
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $action: ${saved}KB freed" >> "$LOG_FILE"
}

# Function to get directory size in KB
get_size() {
    du -sk "$1" 2>/dev/null | cut -f1 || echo 0
}

echo "$(date '+%Y-%m-%d %H:%M:%S') - Starting cleanup process" >> "$LOG_FILE"

# Clean temporary directories
for temp_dir in "/tmp" "/var/tmp"; do
    if [ -d "$temp_dir" ]; then
        before=$(get_size "$temp_dir")
        find "$temp_dir" -type f -atime +7 -delete 2>/dev/null
        find "$temp_dir" -type d -empty -delete 2>/dev/null
        after=$(get_size "$temp_dir")
        log_cleanup "Cleaned $temp_dir" "$before" "$after"
    fi
done

# Clean old log files (older than 30 days)
if [ -d "/var/log" ]; then
    before=$(get_size "/var/log")
    find /var/log -name "*.log.*.gz" -mtime +30 -delete 2>/dev/null
    find /var/log -name "*.log.*" -mtime +30 -delete 2>/dev/null
    after=$(get_size "/var/log")
    log_cleanup "Cleaned old logs" "$before" "$after"
fi

# Clean package cache
if command -v apt-get >/dev/null 2>&1; then
    before=$(get_size "/var/cache/apt")
    apt-get clean >/dev/null 2>&1
    after=$(get_size "/var/cache/apt")
    log_cleanup "APT cache cleanup" "$before" "$after"
elif command -v dnf >/dev/null 2>&1; then
    before=$(get_size "/var/cache/dnf")
    dnf clean all >/dev/null 2>&1
    after=$(get_size "/var/cache/dnf")
    log_cleanup "DNF cache cleanup" "$before" "$after"
fi

# Clean journal logs older than 30 days
# Measure the journal directory in KB with du: `journalctl --disk-usage` prints
# decimal values in mixed units (M, G) that shell integer arithmetic cannot subtract
before=$(get_size "/var/log/journal")
journalctl --vacuum-time=30d >/dev/null 2>&1
after=$(get_size "/var/log/journal")
log_cleanup "Journal cleanup" "$before" "$after"

echo "$(date '+%Y-%m-%d %H:%M:%S') - Cleanup completed. Total space freed: ${CLEANED_SPACE}KB" >> "$LOG_FILE"
EOF
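
Because the cleanup script deletes files with find ... -delete, it can be worth previewing the matches first. The sketch below is an illustration only: list_stale_files is a hypothetical helper, the file names are invented, and touch -d assumes GNU coreutils. It swaps -delete for -print so nothing is removed:

```shell
#!/usr/bin/env bash
# list_stale_files DIR DAYS (hypothetical helper)
# Prints regular files under DIR last modified more than DAYS days ago.
# Same test as the cleanup script, but with -print instead of -delete.
list_stale_files() {
    local dir=$1 days=$2
    find "$dir" -type f -mtime +"$days" -print
}

# Demo in a throwaway directory: one file aged 40 days, one fresh file.
demo_dir=$(mktemp -d)
touch -d '40 days ago' "$demo_dir/app.log.3.gz"
touch "$demo_dir/app.log.1"

# Only the 40-day-old file is listed for a 30-day cutoff.
list_stale_files "$demo_dir" 30

rm -rf "$demo_dir"
```

Once the listed files look right, replacing -print with -delete gives the behaviour used in the cleanup script.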

Make cleanup script executable

Set proper permissions on the cleanup script and ensure it's owned by root for security.

sudo chmod 755 /opt/disk-monitor/cleanup.sh
sudo chown root:root /opt/disk-monitor/cleanup.sh

Create systemd service for disk monitoring

Create a systemd service unit that runs the disk monitoring script. Save it as /etc/systemd/system/disk-monitor.service.

[Unit]
Description=Disk Usage Monitor
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
User=root
Group=root
ExecStart=/opt/disk-monitor/disk-check.sh
StandardOutput=journal
StandardError=journal

Create systemd service for cleanup

Create a systemd service unit for the automated cleanup tasks. Save it as /etc/systemd/system/disk-cleanup.service.

[Unit]
Description=Disk Cleanup Service
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
User=root
Group=root
ExecStart=/opt/disk-monitor/cleanup.sh
StandardOutput=journal
StandardError=journal

Create systemd timers

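Create timer units to schedule regular execution of the monitoring and cleanup services. Save the first unit as /etc/systemd/system/disk-monitor.timer and the second as /etc/systemd/system/disk-cleanup.timer.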

[Unit]
Description=Run disk monitor every 15 minutes
Requires=disk-monitor.service

[Timer]
OnBootSec=5min
OnUnitActiveSec=15min
Persistent=true

[Install]
WantedBy=timers.target

[Unit]
Description=Run disk cleanup daily at 2 AM
Requires=disk-cleanup.service

[Timer]
OnCalendar=*-*-* 02:00:00
Persistent=true

[Install]
WantedBy=timers.target
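
A malformed OnCalendar value stops a timer from loading, so it can help to check expressions before enabling the units. The sketch below relies on the systemd-analyze calendar subcommand; validate_calendar is a hypothetical wrapper, and the second expression (every 15 minutes in calendar syntax) is only an example:

```shell
#!/usr/bin/env bash
# validate_calendar EXPR (hypothetical helper)
# Returns 0 if systemd accepts EXPR, 2 if systemd-analyze is not installed
# (nothing could be checked), otherwise systemd-analyze's own failure status.
validate_calendar() {
    command -v systemd-analyze >/dev/null 2>&1 || return 2
    systemd-analyze calendar "$1" >/dev/null 2>&1
}

for expr in '*-*-* 02:00:00' '*:0/15'; do
    rc=0
    validate_calendar "$expr" || rc=$?
    case $rc in
        0) echo "valid:   $expr" ;;
        2) echo "skipped: $expr (systemd-analyze not found)" ;;
        *) echo "invalid: $expr" ;;
    esac
done
```

Running systemd-analyze calendar directly, without redirecting its output, also prints the normalized form of the expression and the next time it elapses.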

Configure enhanced log rotation

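Set up log rotation so the monitoring logs and the main system logs cannot grow without bound. Save the rules as a drop-in file under /etc/logrotate.d/ (the install script below uses /etc/logrotate.d/disk-monitor). On Debian and Ubuntu, /var/log/syslog and /var/log/auth.log are normally already listed in /etc/logrotate.d/rsyslog, and logrotate reports an error for a log file that is declared twice, so adjust that file rather than repeating the two system-log stanzas.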

# Disk monitor logs
/var/log/disk-monitor.log {
    weekly
    missingok
    rotate 12
    compress
    delaycompress
    notifempty
    create 644 root root
}

/var/log/disk-cleanup.log {
    weekly
    missingok
    rotate 12
    compress
    delaycompress
    notifempty
    create 644 root root
}

# Enhanced system log rotation
/var/log/syslog {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 644 syslog adm
    postrotate
        systemctl reload rsyslog
    endscript
}

/var/log/auth.log {
    weekly
    missingok
    rotate 8
    compress
    delaycompress
    notifempty
    create 640 syslog adm
    postrotate
        systemctl reload rsyslog
    endscript
}
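
logrotate refuses to process a log file that is declared in more than one stanza, which is easy to trigger when adding rules for logs the distribution already rotates. The following rough sketch scans a directory of rule files for repeated paths. find_duplicate_log_paths is a hypothetical helper, and it assumes every stanza header keeps its paths and opening brace on a single line:

```shell
#!/usr/bin/env bash
# find_duplicate_log_paths DIR (hypothetical helper)
# Prints each log path that appears in more than one stanza header in DIR.
# Assumption: headers look like "/path [/path ...] {" on one line.
find_duplicate_log_paths() {
    local conf_dir=$1
    awk '/^[ \t]*\/.*[{][ \t]*$/ {
             sub(/[{][ \t]*$/, "")              # drop the opening brace
             for (i = 1; i <= NF; i++) print $i # one path per output line
         }' "$conf_dir"/* | sort | uniq -d
}

# Demo with two invented rule files that both claim /var/log/syslog.
demo_dir=$(mktemp -d)
printf '%s\n' '/var/log/syslog {' '    daily' '}' > "$demo_dir/rsyslog"
printf '%s\n' '/var/log/syslog /var/log/auth.log {' '    weekly' '}' > "$demo_dir/custom"

find_duplicate_log_paths "$demo_dir"
# prints: /var/log/syslog

rm -rf "$demo_dir"
```

On a real host the directory argument would be /etc/logrotate.d. Wildcard patterns are compared as text only, so two different globs that match the same file are not detected.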

Configure Postfix for email notifications

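Set up a basic Postfix configuration that delivers alerts directly to the recipient's mail server. Many cloud providers block or throttle outbound port 25, so on those hosts set relayhost to an authenticated SMTP relay instead of leaving it empty. The two debconf-set-selections lines apply to Debian and Ubuntu only and preseed the package installer's answers; the postconf lines change the live configuration.

sudo debconf-set-selections <<< "postfix postfix/mailname string $(hostname -f)"
sudo debconf-set-selections <<< "postfix postfix/main_mailer_type string Internet Site"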

# Add or modify these settings in /etc/postfix/main.cf
sudo postconf -e "myhostname = $(hostname -f)"
sudo postconf -e "mydestination = \$myhostname, localhost.\$mydomain, localhost"
sudo postconf -e "relayhost = "
sudo postconf -e "mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128"
sudo postconf -e "inet_protocols = ipv4"

Enable and start services

Reload systemd configuration and enable the timer services to start automatically.

sudo systemctl daemon-reload
sudo systemctl enable --now disk-monitor.timer
sudo systemctl enable --now disk-cleanup.timer
sudo systemctl enable --now postfix

Create log directories and initial files

Ensure proper log directories exist with correct permissions for monitoring and cleanup scripts.

sudo touch /var/log/disk-monitor.log /var/log/disk-cleanup.log
sudo chmod 644 /var/log/disk-monitor.log /var/log/disk-cleanup.log
sudo chown root:root /var/log/disk-monitor.log /var/log/disk-cleanup.log

Verify your setup

Check that all timers are active and services are properly configured.

# Check timer status
sudo systemctl status disk-monitor.timer disk-cleanup.timer

# List all active timers
sudo systemctl list-timers

# Test the monitoring script manually
sudo /opt/disk-monitor/disk-check.sh

# Test the cleanup script manually
sudo /opt/disk-monitor/cleanup.sh

# Check log files were created
ls -la /var/log/disk-*.log

# Verify current disk usage
df -h

# Check postfix is running
sudo systemctl status postfix

Note: No email is sent while every filesystem is below the warning threshold. To test mail delivery, temporarily lower the threshold values in the script.
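
An alternative to editing the script for each test is to let the environment override the thresholds. This is a sketch of a possible change, not something the script above already does; load_thresholds is a hypothetical function that would replace the two fixed assignments in the configuration block:

```shell
#!/usr/bin/env bash
# load_thresholds (hypothetical helper)
# Keeps 80/90 as defaults but lets the caller override either value
# by setting THRESHOLD_WARNING or THRESHOLD_CRITICAL beforehand.
load_thresholds() {
    THRESHOLD_WARNING="${THRESHOLD_WARNING:-80}"
    THRESHOLD_CRITICAL="${THRESHOLD_CRITICAL:-90}"
}

load_thresholds
echo "warning=${THRESHOLD_WARNING}% critical=${THRESHOLD_CRITICAL}%"
```

With that change in place, a one-off test run could look like sudo env THRESHOLD_WARNING=1 THRESHOLD_CRITICAL=2 /opt/disk-monitor/disk-check.sh, while the timer-driven runs keep the defaults.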

Advanced configuration

Add filesystem-specific monitoring

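Create custom thresholds for specific filesystems that need different monitoring levels. The monitoring script above applies one pair of thresholds to every filesystem, so it must be extended to read this file before these values take effect.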

# Custom thresholds per filesystem
# Format: filesystem:warning_threshold:critical_threshold
/:85:95
/var:80:90
/home:75:85
/tmp:90:95
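
One way the monitoring script could consume a file in this format is sketched below. lookup_thresholds is a hypothetical helper, and the fallback of 80 and 90 simply mirrors the global defaults used earlier; the original tutorial does not specify where the file lives, so the demo writes it to a temporary path:

```shell
#!/usr/bin/env bash
# lookup_thresholds MOUNTPOINT CONFIG_FILE (hypothetical helper)
# Prints "warning critical" for MOUNTPOINT, or the defaults "80 90"
# when the mount point has no entry in CONFIG_FILE.
lookup_thresholds() {
    local mount=$1 config=$2
    local fs warning critical
    while IFS=: read -r fs warning critical; do
        case "$fs" in ''|'#'*) continue ;; esac   # skip blanks and comments
        if [ "$fs" = "$mount" ]; then
            echo "$warning $critical"
            return 0
        fi
    done < "$config"
    echo "80 90"
}

# Demo with a temporary copy of the configuration shown above.
conf=$(mktemp)
printf '%s\n' \
    '# Format: filesystem:warning_threshold:critical_threshold' \
    '/:85:95' \
    '/home:75:85' > "$conf"

lookup_thresholds / "$conf"      # prints: 85 95
lookup_thresholds /srv "$conf"   # prints: 80 90 (no entry, defaults apply)

rm -f "$conf"
```

Inside the df loop, the two returned numbers would replace the global THRESHOLD_WARNING and THRESHOLD_CRITICAL for that mount point.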

Configure logrotate for application logs

Add rotation rules for common application log directories to prevent them from filling the disk.

# Nginx logs
/var/log/nginx/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 644 www-data adm
    sharedscripts
    postrotate
        systemctl reload nginx
    endscript
}

# Apache logs
/var/log/apache2/*.log {
    weekly
    missingok
    rotate 12
    compress
    delaycompress
    notifempty
    create 644 www-data adm
    sharedscripts
    postrotate
        systemctl reload apache2
    endscript
}

Common issues

Symptom	Cause	Fix
Timer not running	Service not enabled	`sudo systemctl enable --now disk-monitor.timer`
No email alerts	Postfix not configured	Check `sudo systemctl status postfix` and mail logs
Script permission denied	Incorrect file permissions	`sudo chmod 755 /opt/disk-monitor/*.sh`
Cleanup not working	Insufficient permissions	Ensure scripts run as root user in service files
Log rotation fails	Service reload issues	Check service status and logrotate configuration syntax
High disk usage persists	Large files not cleaned	Use `ncdu /` to identify large directories manually
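
For the last row, a non-interactive alternative to ncdu can be handy on hosts where only a plain shell is available. The sketch below assumes GNU du (for --max-depth); largest_dirs is a hypothetical helper and the demo directory is invented:

```shell
#!/usr/bin/env bash
# largest_dirs DIR [COUNT] (hypothetical helper)
# Prints DIR and its direct subdirectories with sizes in KiB, largest first.
# -x keeps du on a single filesystem so other mounts are not counted.
largest_dirs() {
    local dir=$1 count=${2:-10}
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Demo on a throwaway directory with one large and one small subdirectory.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/big" "$demo_dir/small"
head -c 204800 /dev/zero > "$demo_dir/big/blob"
head -c 4096 /dev/zero > "$demo_dir/small/blob"

largest_dirs "$demo_dir" 3

rm -rf "$demo_dir"
```

On a real system, running largest_dirs /var 10 as root and then repeating it on the largest entry narrows down where the space went.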

Monitoring and maintenance

Regular maintenance tasks to keep your disk monitoring system healthy.

# View recent timer executions
sudo systemctl list-timers --all

# Check monitoring logs
sudo tail -f /var/log/disk-monitor.log

# Check cleanup logs
sudo tail -f /var/log/disk-cleanup.log

# Test logrotate manually
sudo logrotate -d /etc/logrotate.conf

# Force logrotate to run
sudo logrotate -f /etc/logrotate.conf

# Analyze disk usage with ncdu
sudo ncdu /var/log

# Check journal disk usage
journalctl --disk-usage

Next steps

Automated install script

Run this to automate the entire setup

install.sh

#!/usr/bin/env bash
set -euo pipefail

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Default configuration
EMAIL_RECIPIENT="${1:-root@localhost}"
WARNING_THRESHOLD="${2:-80}"
CRITICAL_THRESHOLD="${3:-90}"

usage() {
    echo "Usage: $0 [email_recipient] [warning_threshold] [critical_threshold]"
    echo "Example: $0 admin@example.com 75 85"
    exit 1
}

log_message() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

log_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

cleanup_on_error() {
    log_error "Installation failed. Cleaning up..."
    systemctl disable disk-monitor.timer 2>/dev/null || true
    systemctl disable disk-cleanup.timer 2>/dev/null || true
    rm -rf /opt/disk-monitor
    rm -f /etc/systemd/system/disk-{monitor,cleanup}.{service,timer}
    systemctl daemon-reload
}

trap cleanup_on_error ERR

# Check if running as root
if [[ $EUID -ne 0 ]]; then
    log_error "This script must be run as root"
    exit 1
fi

# Validate thresholds
if [[ ! "$WARNING_THRESHOLD" =~ ^[0-9]+$ ]] || [[ ! "$CRITICAL_THRESHOLD" =~ ^[0-9]+$ ]]; then
    log_error "Thresholds must be numeric"
    usage
fi

if [[ $WARNING_THRESHOLD -ge $CRITICAL_THRESHOLD ]]; then
    log_error "Warning threshold must be less than critical threshold"
    usage
fi

# Detect distribution
log_message "[1/8] Detecting distribution..."
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian)
            PKG_MGR="apt"
            PKG_INSTALL="apt install -y"
            MAIL_PACKAGE="mailutils"
            ;;
        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_INSTALL="dnf install -y"
            MAIL_PACKAGE="mailx"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_INSTALL="yum install -y"
            MAIL_PACKAGE="mailx"
            ;;
        *)
            log_error "Unsupported distribution: $ID"
            exit 1
            ;;
    esac
else
    log_error "Cannot detect distribution"
    exit 1
fi

log_message "Detected: $PRETTY_NAME"

# Update system packages
log_message "[2/8] Updating system packages..."
if [[ $PKG_MGR == "apt" ]]; then
    apt update && apt upgrade -y
else
    $PKG_MGR update -y
fi

# Install required packages
log_message "[3/8] Installing monitoring utilities..."
$PKG_INSTALL $MAIL_PACKAGE postfix logrotate ncdu tree

# Create monitoring directory
log_message "[4/8] Creating monitoring directory and scripts..."
mkdir -p /opt/disk-monitor
mkdir -p /var/log/disk-monitor
chown root:root /opt/disk-monitor
chmod 755 /opt/disk-monitor

# Create disk monitoring script
cat > /opt/disk-monitor/disk-check.sh << 'EOF'
#!/bin/bash
EMAIL_RECIPIENT="EMAIL_PLACEHOLDER"
THRESHOLD_WARNING=WARNING_PLACEHOLDER
THRESHOLD_CRITICAL=CRITICAL_PLACEHOLDER
LOG_FILE="/var/log/disk-monitor/disk-check.log"
HOSTNAME=$(hostname)

log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

send_alert() {
    local severity=$1
    local filesystem=$2
    local usage=$3
    local available=$4
    
    local subject="[$severity] Disk Space Alert - $HOSTNAME"
    local body="Disk space alert for $HOSTNAME:

Filesystem: $filesystem
Usage: $usage%
Available: $available
Threshold: ${severity,,} at ${THRESHOLD_WARNING}%/${THRESHOLD_CRITICAL}%

Please investigate and free up disk space immediately."
    
    echo -e "$body" | mail -s "$subject" "$EMAIL_RECIPIENT" 2>/dev/null || true
    log_message "$severity alert sent for $filesystem ($usage% used)"
}

# -P keeps each filesystem on one line even when the device name is long
df -hP | awk 'NR>1 && !/tmpfs|devtmpfs|udev/ {print $5 " " $6 " " $4}' | while read -r usage filesystem available; do
    usage=${usage%\%}

    if [ "$usage" -ge "$THRESHOLD_CRITICAL" ]; then
        send_alert "CRITICAL" "$filesystem" "$usage" "$available"
    elif [ "$usage" -ge "$THRESHOLD_WARNING" ]; then
        send_alert "WARNING" "$filesystem" "$usage" "$available"
    fi
done

log_message "Disk check completed successfully"
EOF

# Replace placeholders in monitoring script
sed -i "s/EMAIL_PLACEHOLDER/$EMAIL_RECIPIENT/g" /opt/disk-monitor/disk-check.sh
sed -i "s/WARNING_PLACEHOLDER/$WARNING_THRESHOLD/g" /opt/disk-monitor/disk-check.sh
sed -i "s/CRITICAL_PLACEHOLDER/$CRITICAL_THRESHOLD/g" /opt/disk-monitor/disk-check.sh

# Create cleanup script
cat > /opt/disk-monitor/cleanup.sh << 'EOF'
#!/bin/bash
LOG_FILE="/var/log/disk-monitor/cleanup.log"

log_cleanup() {
    local action=$1
    local before=$2
    local after=$3
    local saved=$((before - after))
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $action: Freed ${saved}KB" >> "$LOG_FILE"
}

get_size() {
    du -sk "$1" 2>/dev/null | cut -f1 || echo 0
}

echo "$(date '+%Y-%m-%d %H:%M:%S') - Starting cleanup process" >> "$LOG_FILE"

for temp_dir in "/tmp" "/var/tmp"; do
    if [ -d "$temp_dir" ]; then
        before=$(get_size "$temp_dir")
        find "$temp_dir" -type f -atime +7 -delete 2>/dev/null || true
        find "$temp_dir" -type d -empty -delete 2>/dev/null || true
        after=$(get_size "$temp_dir")
        log_cleanup "Cleaned $temp_dir" "$before" "$after"
    fi
done

if [ -d "/var/log" ]; then
    before=$(get_size "/var/log")
    find /var/log -name "*.log.*.gz" -mtime +30 -delete 2>/dev/null || true
    find /var/log -name "*.log.*" -mtime +30 -delete 2>/dev/null || true
    after=$(get_size "/var/log")
    log_cleanup "Cleaned old logs" "$before" "$after"
fi

if command -v apt-get >/dev/null 2>&1; then
    before=$(get_size "/var/cache/apt")
    apt-get clean >/dev/null 2>&1 || true
    after=$(get_size "/var/cache/apt")
    log_cleanup "APT cache cleanup" "$before" "$after"
elif command -v dnf >/dev/null 2>&1; then
    before=$(get_size "/var/cache/dnf")
    dnf clean all >/dev/null 2>&1 || true
    after=$(get_size "/var/cache/dnf")
    log_cleanup "DNF cache cleanup" "$before" "$after"
fi

echo "$(date '+%Y-%m-%d %H:%M:%S') - Cleanup process completed" >> "$LOG_FILE"
EOF

# Set script permissions
chmod 755 /opt/disk-monitor/*.sh
chown root:root /opt/disk-monitor/*.sh

# Create systemd service files
log_message "[5/8] Creating systemd services..."

cat > /etc/systemd/system/disk-monitor.service << EOF
[Unit]
Description=Disk Space Monitor
After=network.target

[Service]
Type=oneshot
ExecStart=/opt/disk-monitor/disk-check.sh
User=root
EOF

cat > /etc/systemd/system/disk-cleanup.service << EOF
[Unit]
Description=Disk Space Cleanup
After=network.target

[Service]
Type=oneshot
ExecStart=/opt/disk-monitor/cleanup.sh
User=root
EOF

# Create systemd timer files
log_message "[6/8] Creating systemd timers..."

cat > /etc/systemd/system/disk-monitor.timer << EOF
[Unit]
Description=Run disk monitor every 30 minutes
Requires=disk-monitor.service

[Timer]
OnCalendar=*:0/30
Persistent=true

[Install]
WantedBy=timers.target
EOF

cat > /etc/systemd/system/disk-cleanup.timer << EOF
[Unit]
Description=Run disk cleanup daily at 2 AM
Requires=disk-cleanup.service

[Timer]
OnCalendar=*-*-* 02:00:00
RandomizedDelaySec=1800
Persistent=true

[Install]
WantedBy=timers.target
EOF

# Configure log rotation
log_message "[7/8] Configuring log rotation..."
cat > /etc/logrotate.d/disk-monitor << EOF
/var/log/disk-monitor/*.log {
    daily
    missingok
    rotate 30
    compress
    notifempty
    create 644 root root
}
EOF

# Start and enable services
systemctl daemon-reload
systemctl enable disk-monitor.timer
systemctl enable disk-cleanup.timer
systemctl start disk-monitor.timer
systemctl start disk-cleanup.timer

# Verification
log_message "[8/8] Verifying installation..."

if systemctl is-active --quiet disk-monitor.timer; then
    log_message "✓ Disk monitor timer is active"
else
    log_error "✗ Disk monitor timer failed to start"
    exit 1
fi

if systemctl is-active --quiet disk-cleanup.timer; then
    log_message "✓ Disk cleanup timer is active"
else
    log_error "✗ Disk cleanup timer failed to start"
    exit 1
fi

# Test monitoring script
if /opt/disk-monitor/disk-check.sh; then
    log_message "✓ Disk monitoring script works correctly"
else
    log_warning "⚠ Disk monitoring script test failed (mail configuration may be incomplete)"
fi

log_message "Installation completed successfully!"
log_message "Configuration:"
log_message "  Email recipient: $EMAIL_RECIPIENT"
log_message "  Warning threshold: $WARNING_THRESHOLD%"
log_message "  Critical threshold: $CRITICAL_THRESHOLD%"
log_message "  Monitor runs every 30 minutes"
log_message "  Cleanup runs daily at 2 AM"
log_message "  Logs: /var/log/disk-monitor/"

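Review the script before running it. It must run as root and accepts an optional recipient and thresholds, for example: sudo bash install.sh admin@example.com 75 85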
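
The installer fills in EMAIL_PLACEHOLDER with sed, and a recipient containing / or & would be misread as part of the sed expression. A hedged sketch of one way to guard against that follows; escape_sed_replacement is a hypothetical helper, not part of install.sh, and the address is invented:

```shell
#!/usr/bin/env bash
# escape_sed_replacement STRING (hypothetical helper)
# Backslash-escapes the characters that are special in the replacement
# part of a sed s/// command: backslash, slash, and ampersand.
escape_sed_replacement() {
    printf '%s' "$1" | sed -e 's/[\/&]/\\&/g'
}

# Demo: substitute an address containing "&" into a template line.
recipient='ops&dev@example.com'
safe=$(escape_sed_replacement "$recipient")
echo 'EMAIL_RECIPIENT="EMAIL_PLACEHOLDER"' | sed "s/EMAIL_PLACEHOLDER/$safe/"
# prints: EMAIL_RECIPIENT="ops&dev@example.com"
```

In the installer, the same idea would mean passing the escaped value to the existing sed -i line instead of the raw $EMAIL_RECIPIENT.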
