Set up cgroups v2 unified hierarchy with systemd to implement memory limits, isolation policies, and automated pressure responses for container workloads and system processes.
Prerequisites
- Root or sudo access
- systemd-based Linux distribution
- Basic understanding of Linux process management
What this solves
Memory cgroups v2 with systemd provides fine-grained control over memory allocation, enabling process isolation, preventing memory exhaustion attacks, and implementing resource quotas. This unified hierarchy approach replaces the fragmented cgroups v1 system with a cleaner interface for container orchestration and multi-tenant environments.
Step-by-step configuration
Enable cgroups v2 unified hierarchy
Most current distributions already boot with cgroups v2, but older releases may still default to cgroups v1. If your system is on v1, enable the unified v2 hierarchy by modifying the kernel command line.
sudo grep -q "systemd.unified_cgroup_hierarchy=1" /proc/cmdline || echo "Enabling cgroups v2"
sudo sed -i 's/GRUB_CMDLINE_LINUX="\([^"]*\)"/GRUB_CMDLINE_LINUX="\1 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all"/' /etc/default/grub
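The sed command above appends the parameters every time it runs, so a second run duplicates them. A minimal sketch of an idempotent variant follows. The helper name add_kernel_param is hypothetical, and the demo edits a scratch file rather than /etc/default/grub.

```bash
# add_kernel_param FILE PARAM (hypothetical helper, not part of GRUB or systemd)
# Appends PARAM to GRUB_CMDLINE_LINUX in FILE unless it is already there.
add_kernel_param() {
    local file="$1" param="$2" current
    # Extract the current value between the double quotes
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX="\([^"]*\)".*/\1/p' "$file")
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    sed -i "s|^GRUB_CMDLINE_LINUX=\"\([^\"]*\)\"|GRUB_CMDLINE_LINUX=\"\1 $param\"|" "$file"
}

# Demo on a scratch file; point it at /etc/default/grub (as root) once satisfied
demo=$(mktemp)
echo 'GRUB_CMDLINE_LINUX="quiet"' > "$demo"
add_kernel_param "$demo" "systemd.unified_cgroup_hierarchy=1"
add_kernel_param "$demo" "systemd.unified_cgroup_hierarchy=1"   # second call is a no-op
cat "$demo"
rm -f "$demo"
```

The sketch assumes the GRUB_CMDLINE_LINUX value sits on one line in double quotes, which is the usual layout of /etc/default/grub.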
Update bootloader configuration
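Apply the kernel parameter changes and reboot to activate cgroups v2. The update-grub command is specific to Debian and Ubuntu; on RHEL-family systems, run grub2-mkconfig -o /boot/grub2/grub.cfg instead.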
sudo update-grub
sudo reboot
Verify cgroups v2 activation
Confirm that the system is using the unified cgroups v2 hierarchy after reboot.
mount | grep cgroup2
stat -fc %T /sys/fs/cgroup/  # prints cgroup2fs on the unified hierarchy
ls -la /sys/fs/cgroup/
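For use in scripts, the check can be reduced to the filesystem type of /sys/fs/cgroup. The sketch below assumes GNU stat and systemd's usual mount layout: cgroup2fs at /sys/fs/cgroup for v2, or tmpfs there for v1, with a cgroup2 tree under /sys/fs/cgroup/unified in hybrid mode. The helper name cgroup_mode is hypothetical.

```bash
# cgroup_mode ROOT_FS [UNIFIED_FS] (hypothetical helper)
# Maps the filesystem types reported by stat to the hierarchy mode.
cgroup_mode() {
    local root_fs="$1" unified_fs="${2:-}"
    if [ "$root_fs" = "cgroup2fs" ]; then
        echo "unified"            # pure cgroups v2
    elif [ "$root_fs" = "tmpfs" ] && [ "$unified_fs" = "cgroup2fs" ]; then
        echo "hybrid"             # v1 controllers plus a v2 tree under unified/
    elif [ "$root_fs" = "tmpfs" ]; then
        echo "legacy"             # cgroups v1 only
    else
        echo "unknown"
    fi
}

root_fs=$(stat -fc %T /sys/fs/cgroup 2>/dev/null || true)
unified_fs=$(stat -fc %T /sys/fs/cgroup/unified 2>/dev/null || true)
echo "cgroup mode: $(cgroup_mode "$root_fs" "$unified_fs")"
```

Only the "unified" result means the memory settings in this guide apply as described.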
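The mount output should show cgroup2 mounted at /sys/fs/cgroup, and the directory listing should include files such as cgroup.controllers and cgroup.subtree_control in the cgroup root. Files like memory.current and memory.max exist only in child cgroups.
Install memory pressure monitoring tools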
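Install utilities for monitoring memory usage and pressure events within cgroups. systemd-cgtop and systemd-cgls ship with systemd itself, so no separate cgroup package is needed; bc is used by the pressure handler script later in this guide.
sudo apt update
sudo apt install -y procps htop stress-ng bc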
Create memory-limited systemd service
Configure a test service with memory limits to demonstrate cgroups v2 memory control. Save it as /etc/systemd/system/memory-test.service.
[Unit]
Description=Memory Test Service
After=multi-user.target
[Service]
Type=simple
ExecStart=/usr/bin/stress-ng --vm 1 --vm-bytes 50M --timeout 300s
Restart=always
RestartSec=10
MemoryAccounting=yes
MemoryMax=100M
MemoryHigh=80M
MemorySwapMax=0
OOMPolicy=kill
[Install]
WantedBy=multi-user.target
Configure advanced memory limits
Create systemd override files for existing services to apply memory controls without modifying original service files.
sudo mkdir -p /etc/systemd/system/nginx.service.d
sudo tee /etc/systemd/system/nginx.service.d/memory-limits.conf << 'EOF'
[Service]
MemoryAccounting=yes
MemoryMax=512M
MemoryHigh=400M
MemorySwapMax=0
OOMPolicy=continue
EOF
Set up user session memory limits
Configure memory limits for user sessions to prevent individual users from consuming excessive system memory.
sudo mkdir -p /etc/systemd/system/user@.service.d
sudo tee /etc/systemd/system/user@.service.d/memory-limits.conf << 'EOF'
[Service]
MemoryAccounting=yes
MemoryMax=2G
MemoryHigh=1.5G
Delegate=yes
EOF
Create memory pressure notification script
Implement automated responses to memory pressure events using systemd and cgroups v2 pressure stall information.
sudo tee /usr/local/bin/memory-pressure-handler.sh << 'EOF'
#!/bin/bash
CGROUP_PATH="/sys/fs/cgroup/system.slice"
PRESSURE_THRESHOLD=10.0
LOG_FILE="/var/log/memory-pressure.log"
log_event() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
check_memory_pressure() {
    if [[ -f "$CGROUP_PATH/memory.pressure" ]]; then
        # "some avg10" is the percentage of the last 10 seconds in which
        # at least one task in the cgroup was stalled waiting for memory
        local some_avg10=$(awk '/some/ {print $2}' "$CGROUP_PATH/memory.pressure" | cut -d'=' -f2)
        if (( $(echo "$some_avg10 > $PRESSURE_THRESHOLD" | bc -l) )); then
            log_event "High memory pressure detected: $some_avg10%"
            # Trigger cleanup actions
            systemctl reload nginx 2>/dev/null || true
            echo 3 > /proc/sys/vm/drop_caches
            # Non-zero exit marks the oneshot unit as failed, which flags the event in systemctl
            return 1
        fi
    fi
    return 0
}
check_memory_pressure
EOF
sudo chmod +x /usr/local/bin/memory-pressure-handler.sh
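Each line of memory.pressure has the form "some avg10=0.00 avg60=0.00 avg300=0.00 total=0", where the avg fields are percentages of time stalled over the last 10, 60, and 300 seconds. As a sketch of the parsing step in isolation, the helpers below extract any field by name and do the floating-point comparison in awk, which avoids the bc dependency. The names psi_value and psi_exceeds are hypothetical.

```bash
# psi_value FILE KIND FIELD: print one field from a PSI file,
# e.g. psi_value memory.pressure some avg10
psi_value() {
    awk -v kind="$2" -v field="$3" '
        $1 == kind {
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == field) { print kv[2]; exit }
            }
        }' "$1"
}

# psi_exceeds VALUE THRESHOLD: succeed when VALUE > THRESHOLD (floating point)
psi_exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

pressure_file=/sys/fs/cgroup/system.slice/memory.pressure
if [ -r "$pressure_file" ]; then
    avg10=$(psi_value "$pressure_file" some avg10)
    if psi_exceeds "$avg10" 10.0; then
        echo "memory pressure high: ${avg10}%"
    else
        echo "memory pressure ok: ${avg10}%"
    fi
fi
```

Matching on the field name rather than its position keeps the parser working for the "full" line and for avg60 or avg300 as well.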
Configure memory pressure monitoring service
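Create a oneshot service and a systemd timer that runs it regularly to check memory pressure and trigger automated responses. Save the service unit as /etc/systemd/system/memory-pressure-monitor.service: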
[Unit]
Description=Memory Pressure Monitor
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/memory-pressure-handler.sh
User=root
StandardOutput=journal
StandardError=journal
Save the timer unit as /etc/systemd/system/memory-pressure-monitor.timer:
[Unit]
Description=Memory Pressure Monitor Timer
Requires=memory-pressure-monitor.service
[Timer]
OnBootSec=60
OnUnitActiveSec=30
Persistent=true
[Install]
WantedBy=timers.target
Enable memory controllers for container workloads
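Enable resource accounting by default for all units so the memory, CPU, and IO controllers are active throughout the hierarchy. Container runtimes like Docker or Podman additionally need cgroup delegation, which is a per-unit setting (Delegate=yes in the unit's [Service] section) rather than a manager-wide one.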
sudo mkdir -p /etc/systemd/system.conf.d
sudo tee /etc/systemd/system.conf.d/delegate.conf << 'EOF'
[Manager]
DefaultMemoryAccounting=yes
DefaultCPUAccounting=yes
DefaultIOAccounting=yes
EOF
Configure memory swap controls
Set up swap limitations and memory-only constraints for critical services. MemoryZSwapMax= requires systemd 253 or later; omit it on older versions.
sudo mkdir -p /etc/systemd/system/critical-app.service.d
sudo tee /etc/systemd/system/critical-app.service.d/memory-controls.conf << 'EOF'
[Service]
MemoryAccounting=yes
MemoryMax=1G
MemoryHigh=800M
MemorySwapMax=0
MemoryZSwapMax=0
OOMPolicy=kill
EOF
Reload systemd and enable services
Apply all configuration changes and start the monitoring services.
sudo systemctl daemon-reload
sudo systemctl enable --now memory-pressure-monitor.timer
sudo systemctl enable --now memory-test.service
sudo systemctl status memory-pressure-monitor.timer
Monitor memory cgroup usage
Use these commands to inspect memory usage and limits across your cgroups v2 hierarchy.
# View system-wide cgroup memory usage
sudo systemd-cgtop --iterations=1
# Check specific service memory consumption
sudo systemctl status memory-test.service
sudo cat /sys/fs/cgroup/system.slice/memory-test.service/memory.current
sudo cat /sys/fs/cgroup/system.slice/memory-test.service/memory.max
# Monitor memory pressure events
sudo cat /sys/fs/cgroup/system.slice/memory.pressure
sudo journalctl -u memory-pressure-monitor.service -f
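memory.current and memory.max report raw byte counts, and memory.max contains the literal string "max" when no limit is set. As a rough sketch, the helper below turns the pair into a readable summary. The name cgroup_mem_summary is hypothetical, and the loop assumes services live under /sys/fs/cgroup/system.slice.

```bash
# cgroup_mem_summary DIR (hypothetical helper)
# Prints usage in MiB and, when a limit is set, the percentage of memory.max in use.
cgroup_mem_summary() {
    local dir="$1" current max
    current=$(cat "$dir/memory.current")
    max=$(cat "$dir/memory.max")
    if [ "$max" = "max" ]; then
        echo "$((current / 1048576)) MiB used, no limit"
    else
        echo "$((current / 1048576)) MiB used of $((max / 1048576)) MiB ($((current * 100 / max))%)"
    fi
}

# Summarize every service cgroup that exposes memory accounting
for dir in /sys/fs/cgroup/system.slice/*.service; do
    [ -r "$dir/memory.current" ] || continue
    echo "$(basename "$dir"): $(cgroup_mem_summary "$dir")"
done
```

For the test service defined earlier, a 100M limit appears in memory.max as 104857600, because systemd size suffixes are powers of 1024.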
Advanced memory isolation policies
Implement sophisticated memory management for multi-tenant environments and container orchestration.
Create hierarchical memory limits
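Set up nested cgroup limits for complex application stacks. Save the slice unit as /etc/systemd/system/web-stack.slice; services join it by setting Slice=web-stack.slice in their [Service] section, and the slice's limits then cap their combined usage. Note that systemd treats dashes in slice names as hierarchy separators, so web-stack.slice is created as a child of web.slice.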
[Unit]
Description=Web Application Stack
After=slices.target
[Slice]
MemoryAccounting=yes
MemoryMax=4G
MemoryHigh=3G
CPUAccounting=yes
CPUQuota=200%
sudo systemctl daemon-reload
sudo systemctl start web-stack.slice  # the unit has no [Install] section, so "enable" would have no effect
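With nested limits, a service can never use more than the tightest memory.max among its own cgroup and all of its ancestors. The sketch below walks up the tree to find that effective limit. The helper name effective_memory_max is hypothetical, and the root path must be given without a trailing slash.

```bash
# effective_memory_max DIR ROOT (hypothetical helper)
# Walks from DIR up to (but not including) ROOT and prints the smallest
# memory.max found, or "max" if no level sets a limit.
effective_memory_max() {
    local dir="$1" root="$2" limit="max" value
    while :; do
        case "$dir" in
            "$root"/*) ;;        # still below the root, keep walking
            *) break ;;
        esac
        if [ -r "$dir/memory.max" ]; then
            value=$(cat "$dir/memory.max")
            if [ "$value" != "max" ]; then
                if [ "$limit" = "max" ] || [ "$value" -lt "$limit" ]; then
                    limit="$value"
                fi
            fi
        fi
        dir=$(dirname "$dir")
    done
    echo "$limit"
}

unit_dir=/sys/fs/cgroup/system.slice/memory-test.service
if [ -d "$unit_dir" ]; then
    echo "effective limit: $(effective_memory_max "$unit_dir" /sys/fs/cgroup)"
fi
```

A service with MemoryMax=8G placed in a slice capped at 4G would report 4294967296 here, which is the limit the kernel actually enforces.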
Configure OOM killer policies
Customize out-of-memory handling for different service classes.
sudo mkdir -p /etc/systemd/system/database.service.d
sudo tee /etc/systemd/system/database.service.d/oom-policy.conf << 'EOF'
[Service]
OOMPolicy=continue
OOMScoreAdjust=-500
EOF
Set up memory reclaim policies
Configure proactive memory reclaim when approaching limits.
sudo tee /usr/local/bin/memory-reclaim.sh << 'EOF'
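#!/bin/bash
# Usage: memory-reclaim.sh <cgroup_path> <threshold_percent>
CGROUP="$1"
THRESHOLD="$2"
if [[ -z "$CGROUP" || -z "$THRESHOLD" ]]; then
    echo "Usage: $0 <cgroup_path> <threshold_percent>"
    exit 1
fi
CURRENT=$(cat "$CGROUP/memory.current")
MAX=$(cat "$CGROUP/memory.max")
if [[ "$MAX" != "max" ]]; then
    USAGE_PERCENT=$(( CURRENT * 100 / MAX ))
    if [[ $USAGE_PERCENT -gt $THRESHOLD ]]; then
        # memory.reclaim (Linux 5.19+) takes the number of bytes to reclaim,
        # so request the amount by which usage exceeds the threshold
        RECLAIM_BYTES=$(( CURRENT - MAX * THRESHOLD / 100 ))
        echo "$RECLAIM_BYTES" > "$CGROUP/memory.reclaim" 2>/dev/null || true
        logger "Memory reclaim triggered for $CGROUP at $USAGE_PERCENT% usage"
    fi
fi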
EOF
sudo chmod +x /usr/local/bin/memory-reclaim.sh
Troubleshoot memory cgroup issues
Debug common memory limit and pressure problems with these diagnostic commands.
# Check for OOM kills in journals
sudo journalctl --since="1 hour ago" | grep -i "killed process\|out of memory\|oom"
# Verify cgroup v2 features are available
cat /sys/fs/cgroup/cgroup.controllers
cat /sys/fs/cgroup/cgroup.subtree_control
# Monitor memory events
sudo cat /sys/fs/cgroup/system.slice/memory.events
# Test memory pressure with stress tool
sudo systemd-run --uid=1000 --gid=1000 --property=MemoryMax=50M stress-ng --vm 1 --vm-bytes 100M --timeout 10s
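memory.events holds one counter per line (low, high, max, oom, oom_kill), which helps tell throttling at MemoryHigh apart from kills at MemoryMax. A small sketch for reading a single counter follows; memory_event is a hypothetical helper name.

```bash
# memory_event FILE NAME: print the counter NAME from a memory.events file
memory_event() {
    awk -v name="$2" '$1 == name { print $2 }' "$1"
}

events=/sys/fs/cgroup/system.slice/memory.events
if [ -r "$events" ]; then
    echo "high breaches: $(memory_event "$events" high)"
    echo "oom kills:     $(memory_event "$events" oom_kill)"
fi
```

A rising "high" counter with oom_kill at zero indicates the workload is being throttled and reclaimed but not killed.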
Verify your setup
# Confirm cgroups v2 is active
mount | grep cgroup2
cat /proc/cmdline | grep -o "systemd.unified_cgroup_hierarchy=1"
# Check memory accounting is enabled
sudo systemctl show memory-test.service | grep MemoryAccounting
sudo systemctl show memory-test.service | grep MemoryMax
# Verify pressure monitoring is running
sudo systemctl status memory-pressure-monitor.timer
sudo systemctl list-timers | grep memory-pressure
# Test memory limits are enforced
sudo systemctl status memory-test.service
sudo systemd-cgtop --iterations=1
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| cgroup2 not mounted | Still using cgroups v1 | Add kernel parameters and reboot |
| Memory limits not enforced | MemoryAccounting disabled | Set MemoryAccounting=yes in service |
| Services killed by OOM | Memory limit too restrictive | Increase MemoryMax or optimize application |
| Pressure events not triggering | Monitoring script permissions | Ensure script is executable and runs as root |
| Container memory limits ignored | No delegation configured | Set Delegate=yes in the container runtime's unit or a drop-in for it |
| Swap still used despite MemorySwapMax=0 | System swap not disabled | Check /proc/swaps and consider swapoff |
Next steps
- Implement Linux memory cgroups for container workload isolation
- Configure Linux process monitoring with top, htop, and btop for system performance analysis
- Configure Kubernetes resource quotas for namespace isolation
- Set up systemd memory pressure alerting with Prometheus
- Implement container memory limits with Podman and systemd
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Production-ready cgroups v2 memory configuration installer
# Supports Ubuntu, Debian, AlmaLinux, Rocky Linux, CentOS, RHEL, Fedora
readonly SCRIPT_NAME="$(basename "$0")"
readonly LOG_FILE="/var/log/cgroups-v2-install.log"
readonly BACKUP_DIR="/root/cgroups-v2-backup-$(date +%Y%m%d-%H%M%S)"
# Colors
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly BLUE='\033[0;34m'
readonly NC='\033[0m'
# Cleanup on failure
cleanup() {
echo -e "${RED}[ERROR] Installation failed. Check $LOG_FILE for details${NC}" >&2
if [[ -d "$BACKUP_DIR" ]]; then
echo -e "${YELLOW}Restore original files from: $BACKUP_DIR${NC}" >&2
fi
exit 1
}
trap cleanup ERR
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') $*" | tee -a "$LOG_FILE"
}
print_status() {
echo -e "${BLUE}$*${NC}"
log "$*"
}
print_success() {
echo -e "${GREEN}$*${NC}"
log "$*"
}
print_warning() {
echo -e "${YELLOW}$*${NC}"
log "WARNING: $*"
}
print_error() {
echo -e "${RED}$*${NC}" >&2
log "ERROR: $*"
}
usage() {
cat << EOF
Usage: $SCRIPT_NAME [OPTIONS]
Configure Linux memory cgroups v2 with systemd for process isolation and resource control.
OPTIONS:
-h, --help Show this help message
-p, --pressure Memory pressure threshold (default: 80.0)
--skip-reboot Skip automatic reboot (requires manual reboot later)
EXAMPLES:
$SCRIPT_NAME
$SCRIPT_NAME --pressure 70.0
$SCRIPT_NAME --skip-reboot
EOF
}
# Parse arguments
PRESSURE_THRESHOLD="80.0"
SKIP_REBOOT=false
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
usage
exit 0
;;
-p|--pressure)
PRESSURE_THRESHOLD="$2"
shift 2
;;
--skip-reboot)
SKIP_REBOOT=true
shift
;;
*)
echo "Unknown option: $1" >&2
usage
exit 1
;;
esac
done
# Check prerequisites
check_prerequisites() {
print_status "[1/10] Checking prerequisites..."
if [[ $EUID -ne 0 ]]; then
print_error "This script must be run as root"
exit 1
fi
if ! command -v systemctl &> /dev/null; then
print_error "systemd is required but not found"
exit 1
fi
mkdir -p "$BACKUP_DIR"
touch "$LOG_FILE"
chmod 640 "$LOG_FILE"
}
# Detect distribution and package manager
detect_distro() {
print_status "[2/10] Detecting distribution..."
if [[ ! -f /etc/os-release ]]; then
print_error "Cannot detect distribution: /etc/os-release not found"
exit 1
fi
source /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_UPDATE="apt update"
PKG_INSTALL="apt install -y"
GRUB_UPDATE="update-grub"
GRUB_CONFIG="/etc/default/grub"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_UPDATE="dnf check-update || true"
PKG_INSTALL="dnf install -y"
GRUB_UPDATE="grub2-mkconfig -o /boot/grub2/grub.cfg"
GRUB_CONFIG="/etc/default/grub"
;;
amzn)
PKG_MGR="yum"
PKG_UPDATE="yum check-update || true"
PKG_INSTALL="yum install -y"
GRUB_UPDATE="grub2-mkconfig -o /boot/grub2/grub.cfg"
GRUB_CONFIG="/etc/default/grub"
;;
*)
print_error "Unsupported distribution: $ID"
exit 1
;;
esac
print_success "Detected: $PRETTY_NAME ($PKG_MGR)"
}
# Check if cgroups v2 is already enabled
check_cgroups_status() {
print_status "[3/10] Checking current cgroups configuration..."
if [[ "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)" == "cgroup2fs" ]]; then
print_success "cgroups v2 already enabled"
return 0
fi
return 1
}
# Enable cgroups v2
enable_cgroups_v2() {
print_status "[4/10] Enabling cgroups v2 unified hierarchy..."
# Backup GRUB config
cp "$GRUB_CONFIG" "$BACKUP_DIR/"
if ! grep -q "systemd.unified_cgroup_hierarchy=1" "$GRUB_CONFIG"; then
sed -i.bak 's/GRUB_CMDLINE_LINUX="\([^"]*\)"/GRUB_CMDLINE_LINUX="\1 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all"/' "$GRUB_CONFIG"
print_success "Added cgroups v2 kernel parameters"
else
print_warning "cgroups v2 parameters already present in GRUB config"
fi
# Update bootloader
print_status "Updating bootloader configuration..."
$GRUB_UPDATE
}
# Install required packages
install_packages() {
print_status "[5/10] Installing monitoring tools..."
# PKG_UPDATE may contain shell operators ("|| true"), so run it through eval
eval "$PKG_UPDATE"
# Base packages for all distributions
PACKAGES="procps htop bc"
# Distribution-specific packages
case "$PKG_MGR" in
apt)
# systemd-cgtop and systemd-cgls ship with the systemd package itself
PACKAGES="$PACKAGES stress-ng"
;;
dnf|yum)
PACKAGES="$PACKAGES systemd stress-ng"
;;
esac
$PKG_INSTALL $PACKAGES
print_success "Installed monitoring tools"
}
# Create memory test service
create_test_service() {
print_status "[6/10] Creating memory test service..."
cat > /etc/systemd/system/memory-test.service << 'EOF'
[Unit]
Description=Memory Test Service
After=multi-user.target
[Service]
Type=simple
ExecStart=/usr/bin/stress-ng --vm 1 --vm-bytes 50M --timeout 300s
Restart=no
MemoryAccounting=yes
MemoryMax=100M
MemoryHigh=80M
MemorySwapMax=0
OOMPolicy=kill
[Install]
WantedBy=multi-user.target
EOF
chmod 644 /etc/systemd/system/memory-test.service
systemctl daemon-reload
print_success "Created memory test service"
}
# Configure user session limits
configure_user_limits() {
print_status "[7/10] Configuring user session memory limits..."
mkdir -p /etc/systemd/system/user@.service.d
cat > /etc/systemd/system/user@.service.d/memory-limits.conf << 'EOF'
[Service]
MemoryAccounting=yes
MemoryMax=2G
MemoryHigh=1.5G
MemorySwapMax=512M
EOF
chmod 644 /etc/systemd/system/user@.service.d/memory-limits.conf
systemctl daemon-reload
print_success "Configured user session memory limits"
}
# Create memory pressure handler
create_pressure_handler() {
print_status "[8/10] Creating memory pressure monitoring..."
cat > /usr/local/bin/memory-pressure-handler.sh << EOF
#!/usr/bin/env bash
set -euo pipefail
readonly CGROUP_PATH="/sys/fs/cgroup"
readonly PRESSURE_THRESHOLD="$PRESSURE_THRESHOLD"
readonly LOG_FILE="$LOG_FILE"
log_event() {
echo "\$(date '+%Y-%m-%d %H:%M:%S') \$*" >> "\$LOG_FILE"
}
check_memory_pressure() {
if [[ -f "\$CGROUP_PATH/memory.pressure" ]]; then
local some_avg10=\$(awk '/some/ {print \$2}' "\$CGROUP_PATH/memory.pressure" | cut -d'=' -f2)
if (( \$(echo "\$some_avg10 > \$PRESSURE_THRESHOLD" | bc -l) )); then
log_event "High memory pressure detected: \$some_avg10%"
echo 1 > /proc/sys/vm/drop_caches 2>/dev/null || true
return 1
fi
fi
return 0
}
check_memory_pressure
EOF
chmod 755 /usr/local/bin/memory-pressure-handler.sh
# Create monitoring service and timer
cat > /etc/systemd/system/memory-pressure-monitor.service << 'EOF'
[Unit]
Description=Memory Pressure Monitor
After=multi-user.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/memory-pressure-handler.sh
User=root
StandardOutput=journal
StandardError=journal
EOF
cat > /etc/systemd/system/memory-pressure-monitor.timer << 'EOF'
[Unit]
Description=Memory Pressure Monitor Timer
Requires=memory-pressure-monitor.service
[Timer]
OnBootSec=60
OnUnitActiveSec=30
Persistent=true
[Install]
WantedBy=timers.target
EOF
chmod 644 /etc/systemd/system/memory-pressure-monitor.*
systemctl daemon-reload
systemctl enable memory-pressure-monitor.timer
print_success "Created memory pressure monitoring"
}
# Configure container delegation
configure_delegation() {
print_status "[9/10] Configuring cgroups delegation..."
mkdir -p /etc/systemd/system.conf.d
cat > /etc/systemd/system.conf.d/cgroup-delegation.conf << 'EOF'
[Manager]
DefaultMemoryAccounting=yes
DefaultCPUAccounting=yes
DefaultIOAccounting=yes
DefaultIPAccounting=yes
EOF
chmod 644 /etc/systemd/system.conf.d/cgroup-delegation.conf
print_success "Configured cgroups delegation"
}
# Verify configuration
verify_setup() {
print_status "[10/10] Verifying installation..."
# Check systemd services
if systemctl list-unit-files | grep -q memory-test.service; then
print_success "Memory test service installed"
fi
if systemctl list-unit-files | grep -q memory-pressure-monitor.timer; then
print_success "Memory pressure monitor installed"
fi
# Check configuration files
local configs=(
"/etc/systemd/system/user@.service.d/memory-limits.conf"
"/etc/systemd/system.conf.d/cgroup-delegation.conf"
"/usr/local/bin/memory-pressure-handler.sh"
)
for config in "${configs[@]}"; do
if [[ -f "$config" ]]; then
print_success "Configuration file created: $config"
else
print_error "Missing configuration file: $config"
fi
done
print_success "Installation completed successfully!"
if [[ "$SKIP_REBOOT" == "false" ]]; then
print_warning "System will reboot in 10 seconds to activate cgroups v2..."
print_warning "Press Ctrl+C to cancel reboot"
sleep 10
reboot
else
print_warning "Manual reboot required to activate cgroups v2"
echo -e "${YELLOW}After reboot, verify with: mount | grep cgroup2${NC}"
fi
}
main() {
print_status "Starting cgroups v2 memory configuration..."
check_prerequisites
detect_distro
if ! check_cgroups_status; then
enable_cgroups_v2
fi
install_packages
create_test_service
configure_user_limits
create_pressure_handler
configure_delegation
verify_setup
}
main "$@"
Review the script before running it. The script must run as root, so execute it with: sudo bash install.sh