Set up comprehensive automated system maintenance using advanced cron scheduling patterns, custom shell scripts, and monitoring alerts to ensure optimal server performance and reliability.
Prerequisites
- Root or sudo access
- Basic shell scripting knowledge
- Understanding of Linux system administration
- Email server access for alerts
What this solves
Manual system maintenance is time-consuming and error-prone. This tutorial shows you how to automate critical maintenance tasks like log rotation, disk cleanup, security updates, and system monitoring using advanced cron scheduling and robust shell scripts. You'll create a maintenance framework that runs reliably across multiple environments.
System maintenance fundamentals and planning
Effective system maintenance automation requires understanding task priorities, scheduling patterns, and failure handling. Critical tasks like security updates need different scheduling than routine cleanup operations.
Create maintenance directory structure
Organize maintenance scripts and logs in a dedicated directory structure with proper permissions.
sudo mkdir -p /opt/maintenance/{scripts,logs,config}
sudo mkdir -p /opt/maintenance/scripts/{daily,weekly,monthly}
sudo chmod 755 /opt/maintenance
sudo chmod 750 /opt/maintenance/scripts
Set up maintenance user and permissions
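Create a dedicated user for maintenance tasks to improve security and logging. The scripts in this tutorial call sudo for package and system commands, so this user also needs passwordless sudo rights for exactly those commands (for example through a sudoers drop-in edited with visudo); without them, cron runs fail at the first sudo call.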
sudo useradd -r -s /bin/bash -d /opt/maintenance -c "System Maintenance" maintenance
sudo chown -R maintenance:maintenance /opt/maintenance
sudo usermod -aG adm maintenance
Create maintenance configuration file
Define global settings for all maintenance scripts in a central configuration file, /opt/maintenance/config/maintenance.conf, which every script loads with source.
# System Maintenance Configuration
MAINT_LOG_DIR="/opt/maintenance/logs"
MAINT_SCRIPT_DIR="/opt/maintenance/scripts"
MAINT_USER="maintenance"
MAINT_GROUP="maintenance"
# Email settings
ALERT_EMAIL="admin@example.com"
SMTP_SERVER="localhost"
# Retention policies
LOG_RETENTION_DAYS=30
BACKUP_RETENTION_DAYS=7
TEMP_CLEANUP_DAYS=7
# System thresholds
DISK_USAGE_THRESHOLD=85
MEMORY_USAGE_THRESHOLD=90
LOAD_AVERAGE_THRESHOLD=5.0
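Because every script depends on this file, it can help to fail fast when a setting is missing. The following is a minimal sketch, not part of the tutorial's scripts: `load_config` is a hypothetical helper, and the demo reads a temporary copy of a few settings rather than the real file.

```bash
#!/usr/bin/env bash
# Sketch: load a maintenance config and stop early if a required setting is missing.
# load_config is a hypothetical helper name used only for this illustration.
load_config() {
    local config_file="$1"; shift
    local name
    [ -r "$config_file" ] || { echo "Cannot read $config_file" >&2; return 1; }
    # shellcheck disable=SC1090
    source "$config_file"
    for name in "$@"; do
        # ${!name:-} is bash indirect expansion: the value of the variable named by $name
        if [ -z "${!name:-}" ]; then
            echo "Missing required setting: $name" >&2
            return 1
        fi
    done
}

# Demo against a throwaway file holding a few of the settings shown above
sample_conf="$(mktemp)"
cat > "$sample_conf" << 'EOF'
MAINT_LOG_DIR="/opt/maintenance/logs"
ALERT_EMAIL="admin@example.com"
DISK_USAGE_THRESHOLD=85
EOF

if load_config "$sample_conf" MAINT_LOG_DIR ALERT_EMAIL DISK_USAGE_THRESHOLD; then
    echo "Config OK, disk threshold is ${DISK_USAGE_THRESHOLD}%"
fi
```

In a real script the call would point at /opt/maintenance/config/maintenance.conf and list whichever variables that script actually uses.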
Advanced cron scheduling configuration
Install and configure cron service
Ensure cron is installed and running with proper logging enabled.
sudo apt update
sudo apt install -y cron rsyslog
sudo systemctl enable --now cron
sudo systemctl enable --now rsyslog
Configure cron logging
Enable detailed cron logging to track job execution and troubleshoot issues.
# Cron logging configuration
cron.* /var/log/cron.log
# Separate maintenance cron jobs
:programname, isequal, "CRON" /var/log/maintenance-cron.log
& stop
sudo touch /var/log/cron.log /var/log/maintenance-cron.log
sudo chown syslog:adm /var/log/cron.log /var/log/maintenance-cron.log
sudo systemctl restart rsyslog
Create advanced cron schedule patterns
Edit the maintenance user's crontab and add one entry per task. Each entry has five schedule fields (minute, hour, day of month, month, day of week) followed by the command to run.
sudo -u maintenance crontab -e
# Advanced Cron Scheduling for System Maintenance
# Format: minute hour day-of-month month day-of-week command
# Daily maintenance at 2:15 AM
15 2 * * * /opt/maintenance/scripts/daily/daily-maintenance.sh >> /opt/maintenance/logs/daily.log 2>&1
# Weekly maintenance on Sundays at 3:30 AM
30 3 * * 0 /opt/maintenance/scripts/weekly/weekly-maintenance.sh >> /opt/maintenance/logs/weekly.log 2>&1
# Monthly maintenance on the 1st at 4:00 AM
0 4 1 * * /opt/maintenance/scripts/monthly/monthly-maintenance.sh >> /opt/maintenance/logs/monthly.log 2>&1
# Security updates check every 6 hours
0 */6 * * * /opt/maintenance/scripts/daily/security-updates.sh >> /opt/maintenance/logs/security.log 2>&1
# Disk space monitoring every 30 minutes during business hours
*/30 9-17 * * 1-5 /opt/maintenance/scripts/daily/disk-monitor.sh >> /opt/maintenance/logs/disk-monitor.log 2>&1
# System health check every 5 minutes
*/5 * * * * /opt/maintenance/scripts/daily/health-check.sh >> /opt/maintenance/logs/health.log 2>&1
# Log rotation at midnight
0 0 * * * /opt/maintenance/scripts/daily/log-rotation.sh >> /opt/maintenance/logs/log-rotation.log 2>&1
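A crontab entry with a missing field is easy to overlook, so a quick shape check before installing entries can be useful. The sketch below is an illustration only: `check_cron_line` is a hypothetical helper that verifies the field count and allowed characters, not whether values are in range, and it does not accept names such as `mon` or `jan`.

```bash
#!/usr/bin/env bash
# Sketch: sanity-check the shape of a crontab entry.
# check_cron_line is a hypothetical helper; it would still accept an
# out-of-range value such as minute 99.
check_cron_line() {
    local line="$1"
    local re='^[0-9*/,-]+$'
    local -a fields
    local i
    # read splits on whitespace and does not expand "*" as a glob
    read -r -a fields <<< "$line"
    # five schedule fields plus at least one word of command
    [ "${#fields[@]}" -ge 6 ] || return 1
    for i in 0 1 2 3 4; do
        [[ "${fields[i]}" =~ $re ]] || return 1
    done
}

for entry in \
    "15 2 * * * /opt/maintenance/scripts/daily/daily-maintenance.sh" \
    "*/30 9-17 * * 1-5 /opt/maintenance/scripts/daily/disk-monitor.sh" \
    "15 2 * /opt/maintenance/scripts/daily/daily-maintenance.sh"
do
    if check_cron_line "$entry"; then
        echo "ok:      $entry"
    else
        echo "invalid: $entry"
    fi
done
```

The third entry is rejected because it has only three schedule fields before the command.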
Configure systemd timers for critical tasks
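Use systemd timers for more reliable scheduling of critical maintenance tasks with better logging and dependency management. The listing contains two separate unit files, system-maintenance.service followed by system-maintenance.timer, which normally live in /etc/systemd/system/. If you enable the timer, remove the daily-maintenance.sh entry from the crontab so the job does not run twice a day.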
[Unit]
Description=Daily System Maintenance
Wants=network-online.target
After=network-online.target
[Service]
Type=oneshot
User=maintenance
Group=maintenance
WorkingDirectory=/opt/maintenance
ExecStart=/opt/maintenance/scripts/daily/daily-maintenance.sh
StandardOutput=append:/opt/maintenance/logs/systemd-daily.log
StandardError=append:/opt/maintenance/logs/systemd-daily.log
[Unit]
Description=Run daily system maintenance
Requires=system-maintenance.service
[Timer]
OnCalendar=daily
RandomizedDelaySec=30min
Persistent=true
[Install]
WantedBy=timers.target
sudo systemctl daemon-reload
sudo systemctl enable --now system-maintenance.timer
System maintenance script development
Create master daily maintenance script
Develop a comprehensive daily maintenance script with error handling, logging, and modular task execution.
#!/bin/bash
# Daily System Maintenance Script
# Author: System Administrator
# Version: 1.0
set -euo pipefail
# Source configuration
source /opt/maintenance/config/maintenance.conf
# Logging function
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$MAINT_LOG_DIR/daily-$(date +%Y%m%d).log"
}
# Error handling
error_handler() {
local line_no=$1
log "ERROR: Script failed at line $line_no"
echo "Daily maintenance failed at $(date)" | mail -s "Maintenance Alert: $(hostname)" "$ALERT_EMAIL"
exit 1
}
trap 'error_handler ${LINENO}' ERR
log "Starting daily maintenance on $(hostname)"
# Update package lists
log "Updating package lists"
if command -v apt >/dev/null 2>&1; then
sudo apt update -qq
elif command -v dnf >/dev/null 2>&1; then
sudo dnf check-update -q || true
fi
# Clean temporary files older than threshold
log "Cleaning temporary files older than $TEMP_CLEANUP_DAYS days"
find /tmp -type f -mtime +$TEMP_CLEANUP_DAYS -delete 2>/dev/null || true
find /var/tmp -type f -mtime +$TEMP_CLEANUP_DAYS -delete 2>/dev/null || true
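# Clean old log files
# Unreadable paths under /var/log must not trip the ERR trap, hence "|| true"
log "Rotating and compressing old logs"
find /var/log -type f -name "*.log" -mtime +1 -exec gzip {} \; 2>/dev/null || true
find /var/log -type f -name "*.gz" -mtime +"$LOG_RETENTION_DAYS" -delete 2>/dev/null || true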
# Check disk usage
log "Checking disk usage"
disk_usage=$(df / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$disk_usage" -gt "$DISK_USAGE_THRESHOLD" ]; then
log "WARNING: Disk usage at $disk_usage%"
echo "Disk usage critical on $(hostname): $disk_usage%" | mail -s "Disk Alert: $(hostname)" "$ALERT_EMAIL"
fi
# Update file database
log "Updating locate database"
sudo updatedb
# Clean package caches
log "Cleaning package caches"
if command -v apt >/dev/null 2>&1; then
sudo apt autoremove -y -qq
sudo apt autoclean -qq
elif command -v dnf >/dev/null 2>&1; then
sudo dnf autoremove -y -q
sudo dnf clean packages -q
fi
# Generate system summary
log "Generating system summary"
echo "=== Daily System Summary for $(hostname) ===" > "$MAINT_LOG_DIR/summary-$(date +%Y%m%d).txt"
echo "Date: $(date)" >> "$MAINT_LOG_DIR/summary-$(date +%Y%m%d).txt"
echo "Uptime: $(uptime)" >> "$MAINT_LOG_DIR/summary-$(date +%Y%m%d).txt"
echo "Disk Usage: $(df -h / | awk 'NR==2 {print $5}')" >> "$MAINT_LOG_DIR/summary-$(date +%Y%m%d).txt"
echo "Memory Usage: $(free -m | awk 'NR==2{printf "%.1f%%", $3*100/$2}')" >> "$MAINT_LOG_DIR/summary-$(date +%Y%m%d).txt"
echo "Load Average: $(cat /proc/loadavg | cut -d' ' -f1-3)" >> "$MAINT_LOG_DIR/summary-$(date +%Y%m%d).txt"
log "Daily maintenance completed successfully"
# Cleanup old maintenance logs
find "$MAINT_LOG_DIR" -name "daily-*.log" -mtime +$LOG_RETENTION_DAYS -delete
find "$MAINT_LOG_DIR" -name "summary-*.txt" -mtime +$LOG_RETENTION_DAYS -delete
sudo chmod 750 /opt/maintenance/scripts/daily/daily-maintenance.sh
sudo chown maintenance:maintenance /opt/maintenance/scripts/daily/daily-maintenance.sh
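The daily script combines set -euo pipefail with an ERR trap. To see how that pattern behaves in isolation, the sketch below writes a tiny throwaway script to a temporary file and runs it in a separate bash process. The file names and the deliberate `false` are assumptions for the demo, and the alert mail is left out.

```bash
#!/usr/bin/env bash
# Sketch: exercise the log + ERR trap pattern without touching the real system.
demo_script="$(mktemp)"
demo_log="$(mktemp)"

cat > "$demo_script" << 'EOF'
#!/usr/bin/env bash
set -euo pipefail
log_file="$1"
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" >> "$log_file"; }
error_handler() { log "ERROR: Script failed at line $1"; exit 1; }
trap 'error_handler ${LINENO}' ERR
log "step 1 ok"
false                 # simulated failing maintenance task
log "never reached"
EOF

# Run in a child process so the trap and errexit apply exactly as they would under cron
demo_status=0
bash "$demo_script" "$demo_log" || demo_status=$?

cat "$demo_log"
echo "exit status: $demo_status"
```

One design point worth knowing: bash ignores both errexit and the ERR trap for commands that run on the left side of `||` or inside an `if` condition, which is why the demo runs the script as a separate process instead of calling a function with `|| ...`.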
Create security updates monitoring script
Develop a script to monitor and optionally install critical security updates automatically.
#!/bin/bash
# Security Updates Monitoring Script
set -euo pipefail
source /opt/maintenance/config/maintenance.conf
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$MAINT_LOG_DIR/security-$(date +%Y%m%d).log"
}
log "Checking for security updates"
if command -v apt >/dev/null 2>&1; then
# Ubuntu/Debian security updates
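# grep -c prints 0 and exits 1 when nothing matches; "|| true" keeps pipefail from aborting the script
security_updates=$(apt list --upgradable 2>/dev/null | grep -cE '(security|Ubuntu.*-security)' || true)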
if [ "$security_updates" -gt 0 ]; then
log "Found $security_updates security updates available"
# List available security updates
apt list --upgradable 2>/dev/null | grep -E '(security|Ubuntu.*-security)' > "$MAINT_LOG_DIR/security-updates-$(date +%Y%m%d).txt"
# Install unattended-upgrades if not present
if ! dpkg -s unattended-upgrades >/dev/null 2>&1; then
log "Installing unattended-upgrades"
sudo apt install -y unattended-upgrades
fi
# Configure automatic security updates
echo 'Unattended-Upgrade::Allowed-Origins {
"${distro_id}:${distro_codename}-security";
"${distro_id} ESMApps:${distro_codename}-apps-security";
"${distro_id} ESM:${distro_codename}-infra-security";
};' | sudo tee /etc/apt/apt.conf.d/50unattended-upgrades
# Send alert
echo "Security updates available on $(hostname): $security_updates updates" | mail -s "Security Alert: $(hostname)" "$ALERT_EMAIL"
else
log "No security updates available"
fi
elif command -v dnf >/dev/null 2>&1; then
# RHEL/CentOS/AlmaLinux/Rocky security updates
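# check-update exits 100 when updates exist; count non-empty lines and keep set -e from aborting
security_updates=$(dnf -q --security check-update 2>/dev/null | grep -c '[^[:space:]]' || true)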
if [ "$security_updates" -gt 0 ]; then
log "Found $security_updates security updates available"
# List available security updates
dnf --security check-update > "$MAINT_LOG_DIR/security-updates-$(date +%Y%m%d).txt" 2>/dev/null || true
# Configure automatic security updates
if ! rpm -q dnf-automatic >/dev/null 2>&1; then
log "Installing dnf-automatic"
sudo dnf install -y dnf-automatic
fi
# Configure dnf-automatic for security updates only
sudo sed -i 's/upgrade_type = default/upgrade_type = security/' /etc/dnf/automatic.conf
sudo sed -i 's/apply_updates = no/apply_updates = yes/' /etc/dnf/automatic.conf
sudo systemctl enable --now dnf-automatic-install.timer
# Send alert
echo "Security updates available on $(hostname): $security_updates updates" | mail -s "Security Alert: $(hostname)" "$ALERT_EMAIL"
else
log "No security updates available"
fi
fi
log "Security update check completed"
sudo chmod 750 /opt/maintenance/scripts/daily/security-updates.sh
sudo chown maintenance:maintenance /opt/maintenance/scripts/daily/security-updates.sh
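Counting matching lines under set -euo pipefail has a pitfall: grep exits with status 1 when nothing matches, which aborts the script on exactly the days when there are no updates. A minimal sketch of one way around this follows; `count_matches` is a hypothetical helper and the sample text is made up to stand in for package-manager output.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: count stdin lines matching an extended regex.
# grep -c prints 0 but exits 1 when nothing matches, so "|| true" stops
# errexit and pipefail from treating "no matches" as a failure.
count_matches() {
    grep -cE "$1" || true
}

# Made-up sample standing in for "apt list --upgradable" output
sample_upgrades='libssl3/jammy-security 3.0.2 amd64 [upgradable]
vim/jammy-updates 2:8.2 amd64 [upgradable]'

security_count=$(printf '%s\n' "$sample_upgrades" | count_matches 'security')
none_count=$(printf '%s\n' "$sample_upgrades" | count_matches 'no-such-origin')

echo "security updates: $security_count"
echo "other pattern:    $none_count"
```

Both assignments complete even though the second pattern matches nothing, so the script can go on to log "No security updates available".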
Create system health monitoring script
Build a comprehensive health monitoring script that checks system resources and services.
#!/bin/bash
# System Health Check Script
set -euo pipefail
source /opt/maintenance/config/maintenance.conf
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"
}
# Check disk usage
check_disk_usage() {
local usage
usage=$(df / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$usage" -gt "$DISK_USAGE_THRESHOLD" ]; then
log "CRITICAL: Disk usage at $usage%"
echo "Disk usage critical on $(hostname): $usage%" | mail -s "Critical Alert: Disk Usage" "$ALERT_EMAIL"
return 1
fi
log "OK: Disk usage at $usage%"
return 0
}
# Check memory usage
check_memory_usage() {
local mem_usage
mem_usage=$(free | awk 'NR==2{printf "%.0f", $3*100/$2}')
if [ "$mem_usage" -gt "$MEMORY_USAGE_THRESHOLD" ]; then
log "WARNING: Memory usage at $mem_usage%"
echo "High memory usage on $(hostname): $mem_usage%" | mail -s "Warning: Memory Usage" "$ALERT_EMAIL"
return 1
fi
log "OK: Memory usage at $mem_usage%"
return 0
}
# Check load average
check_load_average() {
local load_avg
load_avg=$(cat /proc/loadavg | cut -d' ' -f1)
if (( $(echo "$load_avg > $LOAD_AVERAGE_THRESHOLD" | bc -l) )); then
log "WARNING: Load average at $load_avg"
echo "High load average on $(hostname): $load_avg" | mail -s "Warning: Load Average" "$ALERT_EMAIL"
return 1
fi
log "OK: Load average at $load_avg"
return 0
}
# Check critical services
check_services() {
local services=("ssh" "cron" "rsyslog")
local failed_services=()
for service in "${services[@]}"; do
if ! systemctl is-active --quiet "$service"; then
failed_services+=("$service")
log "CRITICAL: Service $service is not running"
else
log "OK: Service $service is running"
fi
done
if [ ${#failed_services[@]} -gt 0 ]; then
echo "Critical services down on $(hostname): ${failed_services[*]}" | mail -s "Critical Alert: Services Down" "$ALERT_EMAIL"
return 1
fi
return 0
}
# Check filesystem corruption
check_filesystem() {
local errors
errors=$(dmesg | grep -i "ext.*error" | tail -5)
if [ -n "$errors" ]; then
log "WARNING: Filesystem errors detected"
echo "Filesystem errors on $(hostname): $errors" | mail -s "Warning: Filesystem Errors" "$ALERT_EMAIL"
return 1
fi
log "OK: No filesystem errors detected"
return 0
}
# Main health check execution
main() {
local exit_code=0
log "Starting system health check"
check_disk_usage || exit_code=1
check_memory_usage || exit_code=1
check_load_average || exit_code=1
check_services || exit_code=1
check_filesystem || exit_code=1
if [ $exit_code -eq 0 ]; then
log "System health check completed - All OK"
else
log "System health check completed - Issues detected"
fi
return $exit_code
}
main "$@"
sudo chmod 750 /opt/maintenance/scripts/daily/health-check.sh
sudo chown maintenance:maintenance /opt/maintenance/scripts/daily/health-check.sh
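The load-average check compares floating-point numbers with bc, which is not installed on every minimal system. As an alternative sketch, the comparison can be done with awk, which is almost always present. `exceeds_threshold` is a hypothetical helper name, and the 5.0 limit simply mirrors LOAD_AVERAGE_THRESHOLD from the configuration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed when value > limit, using awk for the float comparison.
exceeds_threshold() {
    awk -v value="$1" -v limit="$2" 'BEGIN { exit !(value > limit) }'
}

# Fall back to 0.00 where /proc/loadavg does not exist
load_1min=$(cut -d' ' -f1 /proc/loadavg 2>/dev/null || echo "0.00")

if exceeds_threshold "$load_1min" "5.0"; then
    echo "WARNING: load average at $load_1min"
else
    echo "OK: load average at $load_1min"
fi
```

Because awk treats both arguments as numbers, 10 compares as greater than 9.5, which a plain string comparison would get wrong.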
Create weekly maintenance script
Build a weekly maintenance script for more intensive tasks like system updates and database optimization.
#!/bin/bash
# Weekly System Maintenance Script
set -euo pipefail
source /opt/maintenance/config/maintenance.conf
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$MAINT_LOG_DIR/weekly-$(date +%Y%m%d).log"
}
log "Starting weekly maintenance on $(hostname)"
# Full system update (non-security packages)
log "Performing full system update"
if command -v apt >/dev/null 2>&1; then
sudo apt update -q
sudo apt upgrade -y -q
sudo apt autoremove -y -q
sudo apt autoclean -q
elif command -v dnf >/dev/null 2>&1; then
sudo dnf upgrade -y -q
sudo dnf autoremove -y -q
sudo dnf clean all -q
fi
# Defragment and optimize databases if present
if command -v mysql >/dev/null 2>&1; then
log "Optimizing MySQL databases"
mysql -e "FLUSH LOGS; OPTIMIZE TABLE mysql.general_log; OPTIMIZE TABLE mysql.slow_log;"
fi
# Clean old kernels (keep last 2)
if command -v apt >/dev/null 2>&1; then
log "Cleaning old kernels"
sudo apt autoremove --purge -y
fi
# Update SSL certificates
if command -v certbot >/dev/null 2>&1; then
log "Renewing SSL certificates"
sudo certbot renew --quiet
fi
# Backup important configuration files
log "Backing up configuration files"
backup_dir="/opt/maintenance/backups/$(date +%Y%m%d)"
mkdir -p "$backup_dir"
tar -czf "$backup_dir/etc-backup.tar.gz" /etc/ 2>/dev/null || true
tar -czf "$backup_dir/opt-maintenance-backup.tar.gz" /opt/maintenance/ 2>/dev/null || true
# Clean old backups (top-level dated directories only, never the backups root itself)
find /opt/maintenance/backups -mindepth 1 -maxdepth 1 -type d -mtime +"$BACKUP_RETENTION_DAYS" -exec rm -rf {} +
# Generate weekly report
log "Generating weekly system report"
weekly_report="$MAINT_LOG_DIR/weekly-report-$(date +%Y%m%d).txt"
{
echo "=== Weekly System Report for $(hostname) ==="
echo "Report Date: $(date)"
echo ""
echo "System Information:"
uname -a
echo ""
echo "Disk Usage:"
df -h
echo ""
echo "Memory Information:"
free -h
echo ""
echo "Running Services:"
systemctl list-units --type=service --state=running --no-pager
echo ""
echo "Network Connections:"
ss -tuln
echo ""
echo "Recent Logins:"
last -10
} > "$weekly_report"
log "Weekly maintenance completed successfully"
# Email weekly report
if command -v mail >/dev/null 2>&1; then
mail -s "Weekly Report: $(hostname)" "$ALERT_EMAIL" < "$weekly_report"
fi
sudo chmod 750 /opt/maintenance/scripts/weekly/weekly-maintenance.sh
sudo chown maintenance:maintenance /opt/maintenance/scripts/weekly/weekly-maintenance.sh
sudo -u maintenance mkdir -p /opt/maintenance/backups
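Backup pruning with find and rm -rf deserves a dry run, because a loosely scoped find can match the backup root itself or descend into directories it has just deleted. The sketch below tries the retention rule in a temporary directory. `prune_backups` is a hypothetical helper, and `touch -d` assumes GNU coreutils.

```bash
#!/usr/bin/env bash
# Hypothetical helper: delete backup directories directly under $1 older than $2 days.
prune_backups() {
    local root="$1" days="$2"
    # -mindepth 1 protects the root itself; -maxdepth 1 stops find from
    # descending into directories that are about to be removed
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +
}

# Demo: one stale and one fresh backup directory
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/20240101" "$demo_root/today"
touch "$demo_root/20240101/etc-backup.tar.gz"
touch -d '10 days ago' "$demo_root/20240101"   # GNU touch syntax

prune_backups "$demo_root" 7
ls "$demo_root"
```

Only the directory last modified ten days ago is removed; the fresh one and the root stay in place.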
Monitoring and alerting automation
Install and configure mail system
Set up a mail system for sending maintenance alerts and reports.
sudo apt install -y postfix mailutils
sudo dpkg-reconfigure postfix
Create maintenance monitoring dashboard script
Build a monitoring dashboard that aggregates maintenance status and generates alerts.
#!/bin/bash
# Maintenance Monitoring Dashboard
set -euo pipefail
source /opt/maintenance/config/maintenance.conf
# Function to check if maintenance tasks are running properly
check_maintenance_status() {
local status_file="$MAINT_LOG_DIR/maintenance-status.json"
local current_time=$(date +%s)
local alerts=()
# Check if daily maintenance ran in last 25 hours
if [ -f "$MAINT_LOG_DIR/daily-$(date +%Y%m%d).log" ]; then
local daily_time=$(stat -c %Y "$MAINT_LOG_DIR/daily-$(date +%Y%m%d).log")
local daily_age=$(( (current_time - daily_time) / 3600 ))
if [ $daily_age -gt 25 ]; then
alerts+=("Daily maintenance overdue: $daily_age hours")
fi
else
alerts+=("Daily maintenance log missing")
fi
# Check if health checks are running
local health_log="$MAINT_LOG_DIR/health.log"
if [ -f "$health_log" ]; then
local health_time=$(stat -c %Y "$health_log")
local health_age=$(( (current_time - health_time) / 60 ))
if [ $health_age -gt 10 ]; then
alerts+=("Health checks stopped: $health_age minutes ago")
fi
fi
# Check for critical alerts in logs
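# grep -c on several files prints one "file:count" pair per file, so concatenate first to get a single number
local critical_count=$(cat "$MAINT_LOG_DIR"/*.log 2>/dev/null | grep -c "CRITICAL" || true)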
if [ "$critical_count" -gt 0 ]; then
alerts+=("$critical_count critical alerts in logs")
fi
# Generate status report
{
echo "{"
echo " \"hostname\": \"$(hostname)\","
echo " \"timestamp\": \"$(date -Iseconds)\","
echo " \"status\": \"$( [ ${#alerts[@]} -eq 0 ] && echo 'OK' || echo 'ALERT' )\","
echo " \"alerts\": ["
for i in "${!alerts[@]}"; do
echo -n " \"${alerts[i]}\""
[ $i -lt $((${#alerts[@]} - 1)) ] && echo "," || echo ""
done
echo " ],"
echo " \"disk_usage\": \"$(df / | awk 'NR==2 {print $5}')\","
echo " \"memory_usage\": \"$(free | awk 'NR==2{printf "%.1f%%", $3*100/$2}')\","
echo " \"load_average\": \"$(cat /proc/loadavg | cut -d' ' -f1-3)\","
echo " \"uptime\": \"$(uptime -p)\""
echo "}"
} > "$status_file"
# Send alerts if any issues found
if [ ${#alerts[@]} -gt 0 ]; then
{
echo "Maintenance alerts for $(hostname):"
echo ""
for alert in "${alerts[@]}"; do
echo "- $alert"
done
echo ""
echo "Current system status:"
cat "$status_file"
} | mail -s "Maintenance Alert: $(hostname)" "$ALERT_EMAIL"
fi
}
# Function to cleanup old logs and reports
cleanup_maintenance_files() {
find "$MAINT_LOG_DIR" -name "*.log" -mtime +$LOG_RETENTION_DAYS -delete
find "$MAINT_LOG_DIR" -name "*.txt" -mtime +$LOG_RETENTION_DAYS -delete
find "$MAINT_LOG_DIR" -name "*.json" -mtime +7 -delete
}
# Main execution
check_maintenance_status
cleanup_maintenance_files
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Maintenance monitoring completed"
sudo chmod 750 /opt/maintenance/scripts/daily/maintenance-monitor.sh
sudo chown maintenance:maintenance /opt/maintenance/scripts/daily/maintenance-monitor.sh
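The monitor decides whether a job has stopped by looking at how long ago its log file was last modified. That check can be sketched on its own as follows. `is_stale` is a hypothetical helper, the file names are placeholders in a temporary directory, and `stat -c` and `touch -d` assume GNU coreutils.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the file is missing or was last modified
# more than max_minutes ago.
is_stale() {
    local file="$1" max_minutes="$2"
    local now mtime
    [ -f "$file" ] || return 0
    now=$(date +%s)
    mtime=$(stat -c %Y "$file")   # GNU stat: modification time in epoch seconds
    [ $(( (now - mtime) / 60 )) -gt "$max_minutes" ]
}

demo_dir="$(mktemp -d)"
touch "$demo_dir/health.log"
touch -d '2 hours ago' "$demo_dir/daily.log"

for f in health.log daily.log missing.log; do
    if is_stale "$demo_dir/$f" 10; then
        echo "STALE: $f"
    else
        echo "fresh: $f"
    fi
done
```

Treating a missing file as stale is a design choice made for this sketch; the monitoring script above only reports a missing file for the daily log.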
Set up maintenance monitoring cron job
Add monitoring script to cron to run every hour and detect maintenance issues.
sudo -u maintenance crontab -l > /tmp/maintenance-cron
echo "0 * * * * /opt/maintenance/scripts/daily/maintenance-monitor.sh >> /opt/maintenance/logs/monitor.log 2>&1" >> /tmp/maintenance-cron
sudo -u maintenance crontab /tmp/maintenance-cron
rm /tmp/maintenance-cron
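Running the append step twice would install the same job twice. One way to make it repeatable is to add the line only when it is not already present in the dumped crontab. The sketch below works on a temporary file instead of a real crontab; `add_cron_line` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append a line to a crontab dump only if that exact line is absent.
add_cron_line() {
    local cron_file="$1" line="$2"
    # -F fixed string (so "*" is literal), -x whole-line match, -q quiet
    grep -Fxq -- "$line" "$cron_file" 2>/dev/null || printf '%s\n' "$line" >> "$cron_file"
}

cron_dump="$(mktemp)"
monitor_job='0 * * * * /opt/maintenance/scripts/daily/maintenance-monitor.sh >> /opt/maintenance/logs/monitor.log 2>&1'

add_cron_line "$cron_dump" "$monitor_job"
add_cron_line "$cron_dump" "$monitor_job"   # second call changes nothing

cat "$cron_dump"
```

On a real host the dump would come from `sudo -u maintenance crontab -l` and be reinstalled with `sudo -u maintenance crontab <file>`, as in the commands above.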
Create maintenance status web dashboard
Build a simple web interface to view maintenance status and logs.
#!/bin/bash
# Generate HTML status dashboard
source /opt/maintenance/config/maintenance.conf
html_file="/var/www/html/maintenance-status.html"
status_file="$MAINT_LOG_DIR/maintenance-status.json"
# Create HTML dashboard
cat > "$html_file" << 'EOF'
<!DOCTYPE html>
<html>
<head>
<title>System Maintenance Status</title>
</head>
<body>
<h1>System Maintenance Dashboard</h1>
EOF
if [ -f "$status_file" ]; then
# Parse JSON and add to HTML
python3 << EOF >> "$html_file"
import json
with open('$status_file', 'r') as f:
    data = json.load(f)
status_class = 'ok' if data['status'] == 'OK' else 'alert'
print(f"<div class='{status_class}'>")
print(f"<p>Host: {data['hostname']}</p>")
print(f"<p>Status: {data['status']}</p>")
print(f"<p>Last Updated: {data['timestamp']}</p>")
print(f"<p>Uptime: {data['uptime']}</p>")
print("</div>")
print("<table>")
print("<tr><th>Metric</th><th>Value</th></tr>")
print(f"<tr><td>Disk Usage</td><td>{data['disk_usage']}</td></tr>")
print(f"<tr><td>Memory Usage</td><td>{data['memory_usage']}</td></tr>")
print(f"<tr><td>Load Average</td><td>{data['load_average']}</td></tr>")
print("</table>")
if data['alerts']:
    print("<h2>Active Alerts:</h2>")
    print("<ul>")
    for alert in data['alerts']:
        print(f"<li>{alert}</li>")
    print("</ul>")
else:
    print("<p>No active alerts</p>")
EOF
else
echo "<p>Status file not found</p>" >> "$html_file"
fi
echo "</body></html>" >> "$html_file"
echo "Status dashboard updated: $html_file"
sudo chmod 750 /opt/maintenance/scripts/web-status.sh
sudo chown maintenance:maintenance /opt/maintenance/scripts/web-status.sh
Verify your setup
# Check cron service status
sudo systemctl status cron
# Verify maintenance user crontab
sudo -u maintenance crontab -l
# Check systemd timer status
sudo systemctl status system-maintenance.timer
sudo systemctl list-timers | grep maintenance
# Test maintenance scripts
sudo -u maintenance /opt/maintenance/scripts/daily/health-check.sh
sudo -u maintenance /opt/maintenance/scripts/daily/maintenance-monitor.sh
# Check log files
ls -la /opt/maintenance/logs/
tail /opt/maintenance/logs/health.log
# Verify mail system
echo "Test maintenance alert" | mail -s "Test Alert" root
# Check maintenance status
cat /opt/maintenance/logs/maintenance-status.json
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Cron jobs not running | Cron service stopped | sudo systemctl start cron |
| Permission denied errors | Wrong ownership or permissions | sudo chown -R maintenance:maintenance /opt/maintenance |
| Mail not sending | Postfix not configured | sudo dpkg-reconfigure postfix |
| Scripts fail silently | Missing error handling | Add set -euo pipefail to scripts |
| Disk space alerts spam | Threshold too low | Adjust DISK_USAGE_THRESHOLD in config |
| Log files growing too large | Missing log rotation | Configure logrotate for maintenance logs |
| Systemd timer not active | Timer not enabled | sudo systemctl enable system-maintenance.timer |
Next steps
- Configure centralized cron management with Ansible automation for managing maintenance across multiple servers
- Monitor cron jobs and systemd timers with Prometheus and Grafana for advanced monitoring
- Configure automated compliance scanning with OpenSCAP for security compliance
- Configure maintenance alerting with Slack and Teams integration for better notification management
- Set up a maintenance database with PostgreSQL to store maintenance history
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# System Maintenance Automation Installation Script
# Configures automated system maintenance with advanced cron scheduling
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Configuration
ALERT_EMAIL=${1:-"admin@example.com"}
MAINTENANCE_USER="maintenance"
MAINTENANCE_HOME="/opt/maintenance"
# Cleanup on error
cleanup() {
echo -e "${RED}Installation failed. Cleaning up...${NC}"
userdel -r "$MAINTENANCE_USER" 2>/dev/null || true
rm -rf "$MAINTENANCE_HOME" 2>/dev/null || true
exit 1
}
trap cleanup ERR
# Check if running as root
if [[ $EUID -ne 0 ]]; then
echo -e "${RED}This script must be run as root${NC}"
exit 1
fi
# Auto-detect distribution
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
CRON_SERVICE="cron"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
CRON_SERVICE="crond"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
CRON_SERVICE="crond"
;;
*)
echo -e "${RED}Unsupported distribution: $ID${NC}"
exit 1
;;
esac
else
echo -e "${RED}Cannot detect distribution${NC}"
exit 1
fi
echo -e "${GREEN}Configuring automated system maintenance...${NC}"
# Step 1: Install required packages
echo -e "[1/8] Installing required packages..."
if [ "$PKG_MGR" = "apt" ]; then
apt update
# bc is required by the load-average comparison in health-check.sh
$PKG_INSTALL cron rsyslog logrotate bc
else
$PKG_INSTALL cronie rsyslog logrotate bc
fi
# Step 2: Create directory structure
echo -e "[2/8] Creating maintenance directory structure..."
mkdir -p "$MAINTENANCE_HOME"/{scripts,logs,config}
mkdir -p "$MAINTENANCE_HOME"/scripts/{daily,weekly,monthly}
chmod 755 "$MAINTENANCE_HOME"
chmod 750 "$MAINTENANCE_HOME"/scripts
# Step 3: Create maintenance user
echo -e "[3/8] Creating maintenance user..."
if ! id "$MAINTENANCE_USER" &>/dev/null; then
useradd -r -s /bin/bash -d "$MAINTENANCE_HOME" -c "System Maintenance" "$MAINTENANCE_USER"
fi
chown -R "$MAINTENANCE_USER":"$MAINTENANCE_USER" "$MAINTENANCE_HOME"
usermod -aG adm "$MAINTENANCE_USER"
# Step 4: Create configuration file
echo -e "[4/8] Creating maintenance configuration..."
cat > "$MAINTENANCE_HOME/config/maintenance.conf" << EOF
# System Maintenance Configuration
MAINT_LOG_DIR="$MAINTENANCE_HOME/logs"
MAINT_SCRIPT_DIR="$MAINTENANCE_HOME/scripts"
MAINT_USER="$MAINTENANCE_USER"
MAINT_GROUP="$MAINTENANCE_USER"
# Email settings
ALERT_EMAIL="$ALERT_EMAIL"
SMTP_SERVER="localhost"
# Retention policies
LOG_RETENTION_DAYS=30
BACKUP_RETENTION_DAYS=7
TEMP_CLEANUP_DAYS=7
# System thresholds
DISK_USAGE_THRESHOLD=85
MEMORY_USAGE_THRESHOLD=90
LOAD_AVERAGE_THRESHOLD=5.0
EOF
# Step 5: Configure cron logging
echo -e "[5/8] Configuring cron logging..."
if [ "$PKG_MGR" = "apt" ]; then
echo "cron.* /var/log/cron.log" >> /etc/rsyslog.d/50-default.conf
else
echo "cron.* /var/log/cron.log" >> /etc/rsyslog.conf
fi
touch /var/log/cron.log /var/log/maintenance-cron.log
chown syslog:adm /var/log/cron.log /var/log/maintenance-cron.log 2>/dev/null || chown root:root /var/log/cron.log /var/log/maintenance-cron.log
# Step 6: Create maintenance scripts
echo -e "[6/8] Creating maintenance scripts..."
# Daily maintenance script
cat > "$MAINTENANCE_HOME/scripts/daily/daily-maintenance.sh" << 'EOF'
#!/usr/bin/env bash
set -euo pipefail
source /opt/maintenance/config/maintenance.conf
echo "$(date): Starting daily maintenance"
# Clean temporary files
find /tmp -type f -atime +$TEMP_CLEANUP_DAYS -delete 2>/dev/null || true
find /var/tmp -type f -atime +$TEMP_CLEANUP_DAYS -delete 2>/dev/null || true
# Clean old logs
find $MAINT_LOG_DIR -name "*.log" -mtime +$LOG_RETENTION_DAYS -delete 2>/dev/null || true
echo "$(date): Daily maintenance completed"
EOF
# Health check script
cat > "$MAINTENANCE_HOME/scripts/daily/health-check.sh" << 'EOF'
#!/usr/bin/env bash
set -euo pipefail
source /opt/maintenance/config/maintenance.conf
# Check disk usage
DISK_USAGE=$(df / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -gt "$DISK_USAGE_THRESHOLD" ]; then
echo "$(date): WARNING - Disk usage at ${DISK_USAGE}%"
fi
# Check memory usage
MEM_USAGE=$(free | awk 'NR==2{printf "%.0f", $3*100/$2}')
if [ "$MEM_USAGE" -gt "$MEMORY_USAGE_THRESHOLD" ]; then
echo "$(date): WARNING - Memory usage at ${MEM_USAGE}%"
fi
# Check load average
LOAD_AVG=$(uptime | awk -F'load average:' '{ print $2 }' | cut -d, -f1 | sed 's/^[ \t]*//')
if (( $(echo "$LOAD_AVG > $LOAD_AVERAGE_THRESHOLD" | bc -l) )); then
echo "$(date): WARNING - High load average: $LOAD_AVG"
fi
EOF
# Make scripts executable
chmod 750 "$MAINTENANCE_HOME"/scripts/daily/*.sh
chown -R "$MAINTENANCE_USER":"$MAINTENANCE_USER" "$MAINTENANCE_HOME"
# Step 7: Configure cron jobs
echo -e "[7/8] Configuring cron jobs..."
sudo -u "$MAINTENANCE_USER" crontab - << EOF
# System Maintenance Cron Jobs
15 2 * * * $MAINTENANCE_HOME/scripts/daily/daily-maintenance.sh >> $MAINTENANCE_HOME/logs/daily.log 2>&1
*/5 * * * * $MAINTENANCE_HOME/scripts/daily/health-check.sh >> $MAINTENANCE_HOME/logs/health.log 2>&1
EOF
# Step 8: Start services
echo -e "[8/8] Starting services..."
systemctl enable --now rsyslog
systemctl enable --now "$CRON_SERVICE"
systemctl restart rsyslog
# Verification
echo -e "${GREEN}Verifying installation...${NC}"
if systemctl is-active --quiet "$CRON_SERVICE"; then
echo -e "${GREEN}✓ Cron service is running${NC}"
else
echo -e "${RED}✗ Cron service is not running${NC}"
exit 1
fi
if [ -d "$MAINTENANCE_HOME" ] && [ -f "$MAINTENANCE_HOME/config/maintenance.conf" ]; then
echo -e "${GREEN}✓ Maintenance directory structure created${NC}"
else
echo -e "${RED}✗ Maintenance directory structure missing${NC}"
exit 1
fi
if sudo -u "$MAINTENANCE_USER" crontab -l &>/dev/null; then
echo -e "${GREEN}✓ Cron jobs configured${NC}"
else
echo -e "${RED}✗ Cron jobs not configured${NC}"
exit 1
fi
echo -e "${GREEN}Installation completed successfully!${NC}"
echo -e "${YELLOW}Configuration details:${NC}"
echo "- Maintenance user: $MAINTENANCE_USER"
echo "- Maintenance directory: $MAINTENANCE_HOME"
echo "- Alert email: $ALERT_EMAIL"
echo "- Logs directory: $MAINTENANCE_HOME/logs"
echo ""
echo "Check cron logs with: sudo tail -f /var/log/cron.log"
echo "Check maintenance logs with: sudo tail -f $MAINTENANCE_HOME/logs/health.log"
Review the script before running. It must run as root and accepts the alert address as an optional first argument: sudo bash install.sh admin@example.com