Configure Linux container memory limits and monitoring with systemd and cgroups v2

Advanced · 25 min · Apr 17, 2026
Ubuntu 24.04 · Ubuntu 22.04 · Debian 12 · AlmaLinux 9 · Rocky Linux 9 · Fedora 41

Set up memory limits and monitoring for containers using systemd services and cgroups v2 to prevent OOM kills and track resource usage.

Prerequisites

  • Root access to Linux server
  • systemd-based distribution
  • Basic understanding of Linux processes

What this solves

When containers consume excessive memory, they can crash your applications or bring down entire systems through out-of-memory (OOM) kills. This tutorial shows you how to configure memory limits using systemd and cgroups v2, set up monitoring for memory pressure events, and implement automated alerts when containers approach their limits.

Step-by-step configuration

Verify cgroups v2 is enabled

Check that your system is using cgroups v2, which provides better memory management features than v1.

mount | grep cgroup
stat -fc %T /sys/fs/cgroup

If the output shows cgroup2fs, you're using cgroups v2. If not, enable it by editing the kernel boot parameters.
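As a sketch of that check and the fix, the snippet below classifies the filesystem type that stat reports and prints the boot-parameter change when the unified hierarchy is not active (the `cgroup_mode` helper name is illustrative, not a standard tool):

```shell
# Classify the filesystem type reported for /sys/fs/cgroup
# (compare with: stat -fc %T /sys/fs/cgroup)
cgroup_mode() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;   # legacy/hybrid layouts mount a tmpfs here
    *)         echo "unknown" ;;
  esac
}

if [ "$(cgroup_mode "$(stat -fc %T /sys/fs/cgroup)")" != "v2" ]; then
  echo "Enable the unified hierarchy and reboot:"
  echo '  Debian/Ubuntu: add systemd.unified_cgroup_hierarchy=1 to GRUB_CMDLINE_LINUX'
  echo '                 in /etc/default/grub, then run: sudo update-grub'
  echo '  RHEL-family:   sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"'
fi
```

All distributions listed at the top of this tutorial boot with cgroups v2 by default, so the fix should rarely be needed.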

Install required monitoring tools

Install the tools needed for memory monitoring and pressure notifications. Note that systemd-cgtop ships with systemd itself, so only the load generator and mail utilities need installing.

On Debian/Ubuntu:

sudo apt update
sudo apt install -y stress-ng mailutils

On AlmaLinux, Rocky Linux, or Fedora (stress-ng may require the EPEL repository on the RHEL clones):

sudo dnf install -y stress-ng mailx

Create a test service with memory limits

Create a systemd service that demonstrates memory limiting and monitoring capabilities. Save it as /etc/systemd/system/memory-test.service (the name the enable step later in this tutorial uses).

[Unit]
Description=Memory Test Service
After=network.target

[Service]
Type=simple
User=nobody
Group=nogroup
ExecStart=/usr/bin/stress-ng --vm 1 --vm-bytes 100M --timeout 0
Restart=always
RestartSec=5

# Memory limits
MemoryMax=128M
MemorySwapMax=0
MemoryHigh=100M

# Enable memory accounting
MemoryAccounting=yes
TasksAccounting=yes

# OOM policy
OOMPolicy=stop

[Install]
WantedBy=multi-user.target
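Once the service runs, the unit's values surface as raw byte counts in its cgroup files. As a minimal sketch of the suffix arithmetic systemd applies to memory settings (K/M/G suffixes, base 1024), the hypothetical helper `to_bytes` below mirrors that conversion:

```shell
# Mirror systemd's size-suffix parsing for memory settings (base 1024).
# After `systemctl start memory-test`, MemoryMax=128M appears as this byte
# count in /sys/fs/cgroup/system.slice/memory-test.service/memory.max.
to_bytes() {
  case "$1" in
    *K) echo $(( ${1%K} * 1024 )) ;;
    *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
    *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
    *)  echo "$1" ;;   # already a plain byte count
  esac
}

to_bytes 128M   # 134217728
```

These limits can also be adjusted without editing the unit file, e.g. `sudo systemctl set-property --runtime memory-test.service MemoryMax=256M` (dropped again on reboot because of --runtime).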

Configure memory pressure monitoring

Create a script at /usr/local/bin/memory-pressure-monitor.sh that monitors memory pressure using the cgroups v2 pressure stall information.

#!/bin/bash
# Memory pressure monitoring script

SERVICE_NAME="memory-test"
THRESHOLD=80
LOG_FILE="/var/log/memory-pressure.log"

log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

check_memory_usage() {
    local cgroup_path="/sys/fs/cgroup/system.slice/${SERVICE_NAME}.service"

    if [[ ! -d "$cgroup_path" ]]; then
        log_message "ERROR: Service $SERVICE_NAME not found or not running"
        return 1
    fi

    # Read current memory usage
    local current_memory=$(cat "$cgroup_path/memory.current" 2>/dev/null || echo "0")
    local memory_max=$(cat "$cgroup_path/memory.max" 2>/dev/null || echo "0")

    if [[ "$memory_max" == "max" ]] || [[ "$memory_max" -eq 0 ]]; then
        log_message "WARNING: No memory limit set for $SERVICE_NAME"
        return 1
    fi

    # Calculate usage percentage
    local usage_percent=$((current_memory * 100 / memory_max))
    log_message "Memory usage: ${usage_percent}% (${current_memory} / ${memory_max} bytes)"

    # Check pressure stall information (avg10 field from each line)
    if [[ -f "$cgroup_path/memory.pressure" ]]; then
        local pressure_some=$(grep "some" "$cgroup_path/memory.pressure" | cut -d' ' -f2 | cut -d'=' -f2)
        local pressure_full=$(grep "full" "$cgroup_path/memory.pressure" | cut -d' ' -f2 | cut -d'=' -f2)
        log_message "Memory pressure - some: ${pressure_some}, full: ${pressure_full}"
    fi

    # Alert if usage exceeds threshold
    if [[ "$usage_percent" -gt "$THRESHOLD" ]]; then
        log_message "ALERT: Memory usage ${usage_percent}% exceeds threshold ${THRESHOLD}%"
        send_alert "$SERVICE_NAME" "$usage_percent" "$current_memory" "$memory_max"
    fi
}

send_alert() {
    local service="$1"
    local percent="$2"
    local current="$3"
    local max="$4"
    local message="Memory alert for service: $service
Usage: $percent% ($current / $max bytes)
Time: $(date)"

    # Log to syslog
    logger -p user.warning "Memory pressure alert: $service at $percent%"

    # Optionally send email (configure postfix/sendmail first)
    # echo "$message" | mail -s "Memory Alert: $service" admin@example.com
}

# Main execution
check_memory_usage
sudo chmod +x /usr/local/bin/memory-pressure-monitor.sh
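One detail worth knowing: the script's threshold check uses shell integer arithmetic, so percentages are truncated rather than rounded. The same calculation in isolation (the `usage_pct` helper name is illustrative):

```shell
# Same integer math the monitor uses: current * 100 / max, truncated toward zero.
usage_pct() { echo $(( $1 * 100 / $2 )); }

# 100M in use against the 128M MemoryMax of the test service:
usage_pct 104857600 134217728   # 78 (78.125% truncated)
```

With THRESHOLD=80, the alert therefore fires only once usage reaches a full 81% of MemoryMax.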

Set up automated monitoring with systemd timer

Create a systemd timer to run the memory monitoring script every 30 seconds.

Create /etc/systemd/system/memory-monitor.service:

[Unit]
Description=Memory Pressure Monitor

[Service]
Type=oneshot
ExecStart=/usr/local/bin/memory-pressure-monitor.sh
User=root
Group=root

Then create /etc/systemd/system/memory-monitor.timer:

[Unit]
Description=Run memory pressure monitor every 30 seconds
Requires=memory-monitor.service

[Timer]
OnBootSec=30
OnUnitActiveSec=30
AccuracySec=5

[Install]
WantedBy=timers.target

Configure OOM notification script

Create a script at /usr/local/bin/oom-monitor.sh that monitors for OOM kills using kernel messages and systemd events.

#!/bin/bash
# OOM kill monitoring script

LOG_FILE="/var/log/oom-monitor.log"
LAST_CHECK_FILE="/var/lib/oom-monitor-lastcheck"

log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

check_oom_events() {
    # Get timestamp of last check
    local last_check="$(cat "$LAST_CHECK_FILE" 2>/dev/null || echo "1 hour ago")"

    # Check kernel messages for OOM kills
    local oom_events=$(journalctl --since="$last_check" --grep="killed process" --output=json-pretty | jq -r '.MESSAGE // empty' 2>/dev/null)

    if [[ -n "$oom_events" ]]; then
        log_message "OOM kill detected:"
        echo "$oom_events" >> "$LOG_FILE"
        # Send alert
        logger -p user.crit "OOM kill detected - check $LOG_FILE for details"
    fi

    # Check for systemd service failures due to memory
    local failed_services=$(systemctl --failed --output=json | jq -r '.[] | select(.sub == "failed") | .unit' 2>/dev/null)

    for service in $failed_services; do
        local exit_status=$(systemctl show "$service" --property=ExecMainStatus --value)
        if [[ "$exit_status" == "137" ]] || [[ "$exit_status" == "143" ]]; then
            log_message "Service $service may have been OOM killed (exit status: $exit_status)"
        fi
    done

    # Update timestamp in a format journalctl --since can parse
    date '+%Y-%m-%d %H:%M:%S' > "$LAST_CHECK_FILE"
}

# Main execution
check_oom_events
sudo chmod +x /usr/local/bin/oom-monitor.sh
sudo mkdir -p /var/lib
sudo touch /var/lib/oom-monitor-lastcheck
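The 137/143 check in the script relies on the shell convention that an exit status above 128 means "terminated by signal (status minus 128)". A small decoder illustrating that convention (the `oom_suspect` helper name is illustrative):

```shell
# Exit statuses > 128 encode death by signal: status - 128 is the signal number.
# 137 = 128 + 9 (SIGKILL, what the kernel OOM killer delivers)
# 143 = 128 + 15 (SIGTERM, e.g. systemd stopping a unit that exceeded MemoryMax)
oom_suspect() {
  if [ "$1" -gt 128 ]; then
    echo "killed by signal $(( $1 - 128 ))"
  else
    echo "normal exit $1"
  fi
}

oom_suspect 137   # killed by signal 9
```

Note the heuristic is not conclusive on its own: any SIGKILL produces 137, so the journalctl check for actual OOM messages is what confirms a kernel OOM kill.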

Enable and start monitoring services

Enable all monitoring services and start the test service to demonstrate memory limiting.

sudo systemctl daemon-reload
sudo systemctl enable --now memory-monitor.timer
sudo systemctl enable --now memory-test.service

Create memory usage dashboard script

Build a script at /usr/local/bin/memory-dashboard.sh to display real-time memory usage for all systemd services with memory limits.

#!/bin/bash
# Memory usage dashboard for systemd services

print_header() {
    printf "\n%-25s %-12s %-12s %-12s %-8s\n" "SERVICE" "CURRENT" "MAX" "HIGH" "USAGE%"
    printf "%-25s %-12s %-12s %-12s %-8s\n" "-------" "-------" "---" "----" "------"
}

format_bytes() {
    local bytes="$1"
    if [[ "$bytes" == "max" ]] || [[ "$bytes" -eq 0 ]]; then
        echo "unlimited"
    elif [[ "$bytes" -gt 1073741824 ]]; then
        printf "%.1fG" "$(echo "scale=1; $bytes/1073741824" | bc -l)"
    elif [[ "$bytes" -gt 1048576 ]]; then
        printf "%.1fM" "$(echo "scale=1; $bytes/1048576" | bc -l)"
    elif [[ "$bytes" -gt 1024 ]]; then
        printf "%.1fK" "$(echo "scale=1; $bytes/1024" | bc -l)"
    else
        echo "${bytes}B"
    fi
}

get_service_memory() {
    local service="$1"
    local cgroup_path="/sys/fs/cgroup/system.slice/${service}"

    if [[ ! -d "$cgroup_path" ]]; then
        return 1
    fi

    local current=$(cat "$cgroup_path/memory.current" 2>/dev/null || echo "0")
    local max=$(cat "$cgroup_path/memory.max" 2>/dev/null || echo "max")
    local high=$(cat "$cgroup_path/memory.high" 2>/dev/null || echo "max")

    local usage_percent="N/A"
    if [[ "$max" != "max" ]] && [[ "$max" -gt 0 ]]; then
        usage_percent="$((current * 100 / max))%"
    fi

    printf "%-25s %-12s %-12s %-12s %-8s\n" \
        "${service%.service}" \
        "$(format_bytes "$current")" \
        "$(format_bytes "$max")" \
        "$(format_bytes "$high")" \
        "$usage_percent"
}

print_header

# Find all services with memory limits
for service in /sys/fs/cgroup/system.slice/*.service; do
    if [[ -f "$service/memory.max" ]]; then
        service_name="$(basename "$service")"
        get_service_memory "$service_name"
    fi
done

echo ""
echo "Real-time monitoring: sudo systemctl status memory-monitor.timer"
echo "View pressure logs: sudo tail -f /var/log/memory-pressure.log"
echo "OOM monitoring logs: sudo tail -f /var/log/oom-monitor.log"
sudo chmod +x /usr/local/bin/memory-dashboard.sh
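As an aside, the bc-based format_bytes function can be replaced by coreutils numfmt, which the OOM analysis script later in this tutorial already uses; a sketch (the `human` wrapper name is illustrative):

```shell
# numfmt converts raw byte counts to IEC-suffixed sizes (powers of 1024),
# the same scale format_bytes computes with bc.
human() { numfmt --to=iec "$1"; }

human 134217728   # 128M
human 1536        # 1.5K
```

numfmt avoids spawning bc once per cell, which matters little here but keeps the script dependency-free on minimal installs where bc is absent.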

Configure container memory limits for production services

Apply memory limits to actual production services. This example shows configuration for a web application service.

[Unit]
Description=Web Application
After=network.target

[Service]
Type=notify
User=webapp
Group=webapp
ExecStart=/usr/local/bin/webapp-server
Restart=always
RestartSec=10

# Memory management
MemoryAccounting=yes
MemoryMax=512M
MemoryHigh=400M
MemorySwapMax=128M

# When memory limits are hit
OOMPolicy=stop
OOMScoreAdjust=100

# CPU limits (optional)
CPUAccounting=yes
CPUQuota=80%

# Task limits
TasksAccounting=yes
TasksMax=256

[Install]
WantedBy=multi-user.target
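If the service you want to limit ships its own unit file from a package, the same settings can live in a drop-in directory instead of an edited unit. A sketch, with UNIT_DIR parameterized so it can be tried outside /etc (webapp.service is the example name from above):

```shell
# Drop-in directories layer extra settings over a packaged unit without
# editing it. For real use set UNIT_DIR=/etc/systemd/system (needs root);
# the mktemp default lets the sketch run anywhere.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"

mkdir -p "$UNIT_DIR/webapp.service.d"
cat > "$UNIT_DIR/webapp.service.d/memory.conf" <<'EOF'
[Service]
MemoryAccounting=yes
MemoryHigh=400M
MemoryMax=512M
EOF

# Afterwards: sudo systemctl daemon-reload && sudo systemctl restart webapp
cat "$UNIT_DIR/webapp.service.d/memory.conf"
```

Running `sudo systemctl edit webapp.service` creates the same kind of drop-in interactively, and `systemctl cat webapp.service` shows the merged result.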

Verify your setup

Test that memory limits and monitoring are working correctly.

# Check timer status
sudo systemctl status memory-monitor.timer

# View current memory usage
sudo /usr/local/bin/memory-dashboard.sh

# Check monitoring logs
sudo tail -f /var/log/memory-pressure.log

# Test memory pressure (in another terminal)
sudo systemctl restart memory-test

# Check cgroups v2 memory stats, sorted by memory
sudo systemd-cgtop -m

You should see memory usage statistics, pressure information, and alerts when services approach their limits.

Understanding cgroups v2 memory hierarchy

Memory limit types: MemoryHigh triggers reclaim before hitting hard limits, MemoryMax is the absolute ceiling, and MemorySwapMax controls swap usage separately.
Setting         Purpose                          Behavior when exceeded
MemoryHigh      Soft limit for memory reclaim    Triggers aggressive memory reclaim
MemoryMax       Hard memory limit                OOM kill or process termination
MemorySwapMax   Maximum swap usage               Prevents swap thrashing
MemoryLow       Memory protection threshold      Protects memory from reclaim

Implement advanced pressure notifications

Create pressure stall information (PSI) monitoring

Set up monitoring using the Linux PSI interface for more granular memory pressure detection. Save the script as /usr/local/bin/psi-monitor.sh.

#!/bin/bash
# PSI Memory Monitor with thresholds

PSI_FILE="/proc/pressure/memory"
THRESHOLD_SOME=50.0
THRESHOLD_FULL=10.0
LOG_FILE="/var/log/psi-memory.log"

log_message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}

check_memory_pressure() {
    if [[ ! -f "$PSI_FILE" ]]; then
        log_message "ERROR: PSI not available on this system"
        return 1
    fi

    # Parse the avg10 values from both PSI lines
    local some_avg10=$(grep "some" "$PSI_FILE" | awk '{print $2}' | cut -d'=' -f2)
    local full_avg10=$(grep "full" "$PSI_FILE" | awk '{print $2}' | cut -d'=' -f2)

    log_message "Memory pressure - some avg10: ${some_avg10}%, full avg10: ${full_avg10}%"

    # Check thresholds
    if (( $(echo "$some_avg10 > $THRESHOLD_SOME" | bc -l) )); then
        log_message "WARNING: Memory pressure 'some' exceeds threshold: ${some_avg10}% > ${THRESHOLD_SOME}%"
        # Snapshot the biggest memory consumers for later analysis
        ps aux --sort=-%mem | head -10 >> "$LOG_FILE"
    fi

    if (( $(echo "$full_avg10 > $THRESHOLD_FULL" | bc -l) )); then
        log_message "CRITICAL: Memory pressure 'full' exceeds threshold: ${full_avg10}% > ${THRESHOLD_FULL}%"
        # Emergency actions could go here
    fi
}

check_memory_pressure
sudo chmod +x /usr/local/bin/psi-monitor.sh
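The awk/cut pipeline in the script pulls the avg10 field out of lines shaped like the samples below. A standalone parse helper you can sanity-check against a literal line (the helper name and sample values are illustrative):

```shell
# /proc/pressure/memory lines look like:
#   some avg10=0.00 avg60=0.00 avg300=0.00 total=0
#   full avg10=0.00 avg60=0.00 avg300=0.00 total=0
# psi_avg10 extracts the avg10 value from one such line.
psi_avg10() { awk '{ sub(/^avg10=/, "", $2); print $2 }' <<< "$1"; }

psi_avg10 "some avg10=1.23 avg60=0.45 avg300=0.10 total=123456"   # 1.23
```

The "some" line means at least one task stalled waiting for memory during the window; "full" means all non-idle tasks stalled simultaneously, which is why its threshold is set much lower.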

Troubleshoot OOM kills and memory leaks

Set up comprehensive OOM analysis

Create a tool at /usr/local/bin/oom-analysis.sh to analyze and prevent out-of-memory conditions.

#!/bin/bash
# OOM Analysis and Prevention Tool

REPORT_FILE="/var/log/oom-analysis.log"

generate_oom_report() {
    echo "=== OOM Analysis Report $(date) ===" >> "$REPORT_FILE"

    # Recent OOM kills from kernel
    echo -e "\n--- Recent OOM Kills ---" >> "$REPORT_FILE"
    journalctl --since="24 hours ago" --grep="killed process" >> "$REPORT_FILE"

    # Memory usage by cgroup
    echo -e "\n--- Memory Usage by Service ---" >> "$REPORT_FILE"
    for service in /sys/fs/cgroup/system.slice/*.service; do
        if [[ -f "$service/memory.current" ]] && [[ -f "$service/memory.max" ]]; then
            local name=$(basename "$service")
            local current=$(cat "$service/memory.current")
            local max=$(cat "$service/memory.max")
            if [[ "$max" != "max" ]] && [[ "$current" -gt 0 ]]; then
                local percent=$((current * 100 / max))
                printf "%-30s %10s / %10s (%3d%%)\n" "$name" \
                    "$(numfmt --to=iec "$current")" "$(numfmt --to=iec "$max")" \
                    "$percent" >> "$REPORT_FILE"
            fi
        fi
    done

    # System memory info
    echo -e "\n--- System Memory ---" >> "$REPORT_FILE"
    free -h >> "$REPORT_FILE"

    # Top memory consumers
    echo -e "\n--- Top Memory Consumers ---" >> "$REPORT_FILE"
    ps aux --sort=-%mem | head -10 >> "$REPORT_FILE"

    echo "" >> "$REPORT_FILE"
}

generate_oom_report
echo "OOM analysis complete. Report saved to $REPORT_FILE"
sudo chmod +x /usr/local/bin/oom-analysis.sh

Common issues

  • Service keeps getting OOM killed. Cause: memory limit too low for the workload. Fix: increase MemoryMax or optimize the application's memory usage.
  • Memory pressure alerts not triggering. Cause: monitoring script permissions or timer not active. Fix: run sudo systemctl status memory-monitor.timer and check the logs.
  • cgroups v2 not available. Cause: system is using cgroups v1. Fix: add systemd.unified_cgroup_hierarchy=1 to the kernel boot parameters.
  • PSI interface missing. Cause: kernel compiled without PSI support. Fix: enable PSI in the kernel config or use alternative monitoring.
  • High memory usage but no reclaim. Cause: MemoryHigh set too high. Fix: lower MemoryHigh to trigger earlier reclaim.
