Learn how to optimize Linux I/O performance through kernel parameter tuning, storage scheduler configuration, and filesystem optimizations. This tutorial covers scheduler selection, queue depth tuning, and performance monitoring for high-throughput applications.
Prerequisites
- Root access to the Linux server
- Basic understanding of Linux command line
- Storage devices to optimize (SSD, NVMe, or HDD)
What this solves
Poor I/O performance can severely impact database servers, web applications, and data processing workloads. Linux I/O schedulers and kernel parameters aren't optimized for all workload types by default, leading to bottlenecks in high-throughput scenarios.
This tutorial shows you how to identify I/O bottlenecks, select appropriate schedulers for different storage types, tune kernel parameters, and optimize filesystem mount options for maximum throughput and minimum latency.
Step-by-step configuration
Install monitoring and benchmarking tools
Install essential tools for monitoring I/O performance and benchmarking storage devices.
sudo apt update
sudo apt install -y sysstat iotop fio hdparm nvme-cli
Analyze current I/O performance
Check current I/O scheduler settings and baseline performance before making changes.
# Check current schedulers for all block devices
for dev in /sys/block/*/queue/scheduler; do
echo "$dev: $(cat $dev)"
done
# Show current I/O statistics
iostat -x 1 3
# Check NVMe device information (if applicable)
sudo nvme list
Configure I/O schedulers for different storage types
Set optimal schedulers based on storage technology. NVMe SSDs benefit from none or mq-deadline, while HDDs work better with bfq or mq-deadline.
# Check storage type and set appropriate scheduler
lsblk -d -o name,rota
# For NVMe/SSD (rota=0) - use the none scheduler
echo none | sudo tee /sys/block/nvme0n1/queue/scheduler
# For HDDs (rota=1) - use the bfq scheduler
echo bfq | sudo tee /sys/block/sda/queue/scheduler
# For high-performance SSDs - use mq-deadline
echo mq-deadline | sudo tee /sys/block/nvme0n1/queue/scheduler
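On hosts with a mix of device types, the per-device commands above can be wrapped in a small helper. This is a sketch, not a standard tool: `pick_scheduler` is a made-up function name, and since available schedulers vary by kernel build, the loop checks that the choice is actually offered before applying it.

```shell
#!/usr/bin/env bash
# Map a device name and its rotational flag (0 = flash, 1 = spinning disk)
# to the scheduler suggested above.
pick_scheduler() {
  local name="$1" rota="$2"
  case "$name" in
    nvme*) echo "none" ;;   # NVMe: let the device handle ordering
    *) if [ "$rota" = "1" ]; then echo "bfq"; else echo "mq-deadline"; fi ;;
  esac
}

# Apply to every block device (uncomment to run; requires root):
# for path in /sys/block/*/queue/scheduler; do
#   dev=${path#/sys/block/}; dev=${dev%%/*}
#   sched=$(pick_scheduler "$dev" "$(cat "/sys/block/$dev/queue/rotational")")
#   grep -qw "$sched" "$path" && echo "$sched" | sudo tee "$path" >/dev/null
# done
```

The `grep -qw` guard matters because writing an unsupported scheduler name to the sysfs file fails with an I/O error.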
Make scheduler changes persistent
Create a udev rules file (for example /etc/udev/rules.d/60-io-schedulers.rules) to automatically apply scheduler settings on boot based on device type.
# Set scheduler for NVMe devices
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
# Set scheduler for SSDs
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"
# Set scheduler for HDDs
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
# Apply udev rules
sudo udevadm control --reload-rules
sudo udevadm trigger
Optimize I/O queue depths
Adjust queue depths to match storage capabilities and workload requirements. Higher queue depths improve throughput but may increase latency.
# Check current queue depth
cat /sys/block/nvme0n1/queue/nr_requests
# Increase queue depth for high-throughput NVMe
echo 1024 | sudo tee /sys/block/nvme0n1/queue/nr_requests
# Set read-ahead for sequential workloads (in KB)
echo 4096 | sudo tee /sys/block/nvme0n1/queue/read_ahead_kb
# For databases, reduce read-ahead to minimize memory usage
echo 128 | sudo tee /sys/block/nvme0n1/queue/read_ahead_kb
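The same read-ahead setting is also exposed through `blockdev --getra`/`--setra`, which counts 512-byte sectors rather than KB, so the two interfaces differ by a factor of two. A quick converter (the helper name is ours):

```shell
# read_ahead_kb (sysfs) and blockdev --getra/--setra (sectors) describe
# the same setting; one KB equals two 512-byte sectors.
kb_to_sectors() { echo $(( $1 * 2 )); }

kb_to_sectors 4096   # read_ahead_kb=4096  <->  blockdev --setra 8192
kb_to_sectors 128    # read_ahead_kb=128   <->  blockdev --setra 256
```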
Configure kernel I/O parameters
Optimize kernel parameters for better I/O performance, including dirty page handling and CPU scaling. Add the following lines to /etc/sysctl.d/99-io-performance.conf.
# Reduce dirty page writeback for consistent performance
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_writeback_centisecs = 100
vm.dirty_expire_centisecs = 200
# Optimize for I/O intensive workloads
vm.swappiness = 1
vm.vfs_cache_pressure = 50
# Increase maximum number of memory map areas
vm.max_map_count = 262144
# TCP buffer tuning for network I/O
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
# CPU scheduler tuning (this key was removed from sysctl in kernel 5.13+)
kernel.sched_migration_cost_ns = 5000000
# Apply kernel parameters; -e skips keys the running kernel does not support
sudo sysctl -e -p /etc/sysctl.d/99-io-performance.conf
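Percentage-based dirty limits scale with RAM, which gets surprisingly large on big hosts: vm.dirty_ratio = 10 on a 256 GB server still allows ~25 GB of unwritten pages. The kernel also accepts absolute limits via vm.dirty_bytes and vm.dirty_background_bytes (setting one zeroes its ratio counterpart). A quick sketch for sizing them; `ratio_to_bytes` is a throwaway helper:

```shell
# Convert a percentage of total memory into a byte value for vm.dirty_bytes.
ratio_to_bytes() { echo $(( $1 * $2 / 100 )); }  # <ratio %> <mem bytes>

ratio_to_bytes 10 $((8 * 1024 * 1024 * 1024))   # 10% of 8 GiB, in bytes
# On a live system, derive total memory from /proc/meminfo (reported in kB):
# mem_bytes=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) * 1024 ))
```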
Optimize filesystem mount options
Configure mount options for better I/O performance based on filesystem type and use case.
# High-performance ext4 options for databases
# WARNING: nobarrier and data=writeback trade crash safety for speed and can
# lose or corrupt data on power failure (nobarrier is deprecated or removed
# on modern kernels). Use only with a battery-backed write cache, if at all.
/dev/nvme0n1p1 /var/lib/mysql ext4 defaults,noatime,nobarrier,data=writeback 0 2
# Balanced ext4 options for general use
/dev/nvme0n1p2 /opt/data ext4 defaults,noatime,commit=30 0 2
# XFS options for large files and high throughput
/dev/nvme0n1p3 /var/lib/backups xfs defaults,noatime,logbsize=256k,largeio 0 2
# Test mount options before rebooting
sudo mount -o remount,noatime /var/lib/mysql
# Verify mount options
mount | grep nvme
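fstab edits only take effect at (re)mount time, and a silently failed remount is easy to miss. One way to assert the options actually in effect is to check findmnt output; `has_option` is a throwaway helper for exact option matching, not a standard command:

```shell
# Check whether a comma-separated mount option string contains a given flag.
has_option() {
  case ",$1," in (*",$2,"*) return 0 ;; (*) return 1 ;; esac
}

# Example against a literal option string:
has_option "rw,noatime,data=ordered" noatime && echo "noatime active"
# Against a live mount (mount point is an example):
# has_option "$(findmnt -no OPTIONS /var/lib/mysql)" noatime || echo "missing!"
```

Matching on the comma-delimited string avoids false positives such as `relatime` matching a bare grep for `atime`.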
Configure per-process I/O scheduling
Set I/O priority classes for different applications to prevent I/O interference.
# Set real-time I/O priority for database
sudo ionice -c 1 -n 4 -p $(pgrep mysqld)
# Set idle priority for backup processes
sudo ionice -c 3 -p $(pgrep backup)
# Check I/O priorities
for pid in $(pgrep -f mysql); do
echo "PID $pid: $(ionice -p $pid)"
done
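If the backup runs as a systemd service, the same effect can be baked into the unit instead of calling ionice by hand, using the IOSchedulingClass and IOSchedulingPriority directives from systemd.exec. A sketch of a drop-in (the unit name backup.service is an example):

```ini
# /etc/systemd/system/backup.service.d/io.conf (example unit name)
[Service]
IOSchedulingClass=idle
# Alternatively, use best-effort with a low priority
# (0 is highest, 7 is lowest):
# IOSchedulingClass=best-effort
# IOSchedulingPriority=7
```

Run `sudo systemctl daemon-reload` and restart the service for the drop-in to take effect. Note that idle-class I/O only runs when no other class wants the disk, so backups may take considerably longer under load.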
Create I/O performance monitoring script
Set up continuous monitoring to track I/O performance improvements and identify bottlenecks. Save the script below as /usr/local/bin/io-monitor.sh so the chmod and systemd service that follow can find it.
#!/bin/bash
LOGFILE="/var/log/io-performance.log"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
echo "=== I/O Performance Report - $TIMESTAMP ===" >> $LOGFILE
# Device statistics
echo "Device Statistics:" >> $LOGFILE
iostat -x 1 1 | grep -E '(Device|nvme|sd[a-z])' >> $LOGFILE
# Top I/O processes (-b runs iotop in batch mode so output can be logged)
echo "Top I/O Processes:" >> $LOGFILE
iotop -b -a -o -d 1 -n 1 | head -20 >> $LOGFILE
# Queue depths and schedulers
echo "Scheduler Configuration:" >> $LOGFILE
for dev in /sys/block/*/queue/scheduler; do
device=$(echo $dev | cut -d'/' -f4)
scheduler=$(cat $dev | grep -o '\[.*\]' | tr -d '[]')
queue_depth=$(cat /sys/block/$device/queue/nr_requests)
echo "$device: scheduler=$scheduler, queue_depth=$queue_depth" >> $LOGFILE
done
echo "" >> $LOGFILE
sudo chmod 755 /usr/local/bin/io-monitor.sh
# Create systemd timer for regular monitoring
sudo tee /etc/systemd/system/io-monitor.timer > /dev/null << 'EOF'
[Unit]
Description=I/O Performance Monitor Timer
[Timer]
OnCalendar=*:0/10
Persistent=true
[Install]
WantedBy=timers.target
EOF
sudo tee /etc/systemd/system/io-monitor.service > /dev/null << 'EOF'
[Unit]
Description=I/O Performance Monitor
[Service]
Type=oneshot
ExecStart=/usr/local/bin/io-monitor.sh
EOF
# Enable monitoring timer
sudo systemctl daemon-reload
sudo systemctl enable --now io-monitor.timer
Benchmark I/O improvements
Run comprehensive I/O benchmarks
Use fio to test different I/O patterns and measure performance improvements.
# Random read performance (database-like workload); reads from the raw
# device are non-destructive
sudo fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting --filename=/dev/nvme0n1
# Sequential write performance (backup/logging workload)
# WARNING: never run write tests against a raw device that holds data -
# point them at a scratch file on a mounted filesystem instead
sudo fio --name=seqwrite --ioengine=libaio --iodepth=32 --rw=write --bs=64k --direct=1 --size=2G --numjobs=2 --runtime=60 --group_reporting --filename=/opt/data/fio-test
# Mixed workload test (also writes, so it uses the scratch file too)
sudo fio --name=mixed --ioengine=libaio --iodepth=16 --rw=randrw --rwmixread=70 --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting --filename=/opt/data/fio-test
# Clean up the scratch file
sudo rm -f /opt/data/fio-test
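When reading fio results, throughput and IOPS are linked through the block size, which allows a quick sanity check on any reported number (the helper name is ours):

```shell
# MiB/s implied by an IOPS figure at a given block size in KiB.
iops_to_mib_s() { echo $(( $1 * $2 / 1024 )); }  # <iops> <block size KiB>

iops_to_mib_s 100000 4    # 100k IOPS at 4k blocks
iops_to_mib_s 10000 64    # 10k IOPS at 64k blocks
```

If a run reports both IOPS and bandwidth that do not satisfy this relation, suspect a misread unit (KB vs KiB, MB/s vs MiB/s) before suspecting the hardware.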
Verify your setup
# Check active I/O schedulers
for dev in /sys/block/*/queue/scheduler; do
echo "$dev: $(cat $dev)"
done
# Verify kernel parameters
sudo sysctl -a | grep -E 'vm.dirty|vm.swappiness'
# Check I/O statistics
iostat -x 1 3
# Monitor top I/O processes
iotop -a -o -d 2
# Check monitoring timer status
sudo systemctl status io-monitor.timer
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| High I/O wait times | Wrong scheduler for storage type | Switch to appropriate scheduler (none for NVMe, bfq for HDD) |
| Inconsistent write performance | Large dirty page ratio | Reduce vm.dirty_ratio to 10 or lower |
| Scheduler changes don't persist | Missing udev rules | Create /etc/udev/rules.d/60-io-schedulers.rules |
| Database slowdowns during backups | I/O priority conflicts | Set backup processes to idle priority with ionice -c 3 |
| Low throughput on NVMe | Insufficient queue depth | Increase nr_requests to 1024 or higher |
| High memory usage | Excessive read-ahead buffering | Reduce read_ahead_kb for random access workloads |
Next steps
- Configure Linux memory management and swap optimization for high-performance workloads
- Optimize Linux system performance with kernel parameters and system tuning
- Configure Linux disk usage monitoring and automated cleanup with systemd timers
- Set up Linux storage monitoring with smartmontools and automated health alerts
- Configure Linux filesystem tuning and optimization for database workloads
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Global variables
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BACKUP_DIR="/opt/io-tuning-backup-$(date +%Y%m%d-%H%M%S)"
PKG_MGR=""
PKG_INSTALL=""
WORKLOAD_TYPE="balanced"
# Usage function
usage() {
echo "Usage: $0 [OPTIONS]"
echo "Options:"
echo " -w, --workload TYPE Workload type: database|web|storage|balanced (default: balanced)"
echo " -h, --help Show this help message"
echo ""
echo "Examples:"
echo " $0 # Apply balanced I/O optimizations"
echo " $0 -w database # Optimize for database workloads"
echo " $0 -w storage # Optimize for high-throughput storage"
exit 1
}
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
-w|--workload)
WORKLOAD_TYPE="$2"
if [[ ! "$WORKLOAD_TYPE" =~ ^(database|web|storage|balanced)$ ]]; then
echo -e "${RED}Error: Invalid workload type. Use: database, web, storage, or balanced${NC}" >&2
exit 1
fi
shift 2
;;
-h|--help)
usage
;;
*)
echo -e "${RED}Error: Unknown option $1${NC}" >&2
usage
;;
esac
done
# Cleanup function for rollback
cleanup() {
if [[ $? -ne 0 ]]; then
echo -e "${RED}Error occurred. Check logs and consider restoring from backup: $BACKUP_DIR${NC}" >&2
fi
}
trap cleanup ERR
# Check prerequisites
check_prerequisites() {
echo -e "${YELLOW}[1/8] Checking prerequisites...${NC}"
if [[ $EUID -ne 0 ]]; then
echo -e "${RED}This script must be run as root${NC}" >&2
exit 1
fi
if [ ! -f /etc/os-release ]; then
echo -e "${RED}/etc/os-release not found. Cannot determine distribution.${NC}" >&2
exit 1
fi
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
;;
*)
echo -e "${RED}Unsupported distribution: $ID${NC}" >&2
exit 1
;;
esac
mkdir -p "$BACKUP_DIR"
echo -e "${GREEN}Prerequisites check passed${NC}"
}
# Install monitoring tools
install_tools() {
echo -e "${YELLOW}[2/8] Installing I/O monitoring and benchmarking tools...${NC}"
if [[ "$PKG_MGR" == "apt" ]]; then
apt update
fi
$PKG_INSTALL sysstat iotop fio hdparm util-linux
# Install nvme-cli if available
if $PKG_INSTALL nvme-cli 2>/dev/null; then
echo -e "${GREEN}nvme-cli installed${NC}"
else
echo -e "${YELLOW}nvme-cli not available, skipping${NC}"
fi
echo -e "${GREEN}Tools installation completed${NC}"
}
# Backup current configuration
backup_config() {
echo -e "${YELLOW}[3/8] Backing up current configuration...${NC}"
# Backup current scheduler settings
echo "# Current scheduler settings" > "$BACKUP_DIR/schedulers.txt"
for dev in /sys/block/*/queue/scheduler; do
if [[ -r "$dev" ]]; then
echo "$dev: $(cat $dev)" >> "$BACKUP_DIR/schedulers.txt"
fi
done
# Backup current sysctl settings
sysctl -a > "$BACKUP_DIR/sysctl_original.conf" 2>/dev/null || true
# Backup current udev rules
if [[ -d /etc/udev/rules.d ]]; then
cp -r /etc/udev/rules.d "$BACKUP_DIR/udev_rules_backup" 2>/dev/null || true
fi
echo -e "${GREEN}Configuration backed up to $BACKUP_DIR${NC}"
}
# Analyze current I/O performance
analyze_io() {
echo -e "${YELLOW}[4/8] Analyzing current I/O performance...${NC}"
echo "Current scheduler settings:"
for dev in /sys/block/*/queue/scheduler; do
if [[ -r "$dev" ]]; then
echo " $dev: $(cat $dev)"
fi
done
echo -e "${GREEN}Current I/O analysis completed${NC}"
}
# Configure I/O schedulers
configure_schedulers() {
echo -e "${YELLOW}[5/8] Configuring I/O schedulers...${NC}"
# Create udev rules for persistent scheduler settings
cat > /etc/udev/rules.d/60-io-schedulers.rules << 'EOF'
# I/O Scheduler optimization rules
# NVMe devices - use none scheduler for best performance
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
# SSDs - use mq-deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"
# HDDs - use bfq for better responsiveness
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
EOF
# Apply schedulers immediately
for dev in /sys/block/*/queue/scheduler; do
if [[ -w "$dev" ]]; then
device=$(echo "$dev" | cut -d'/' -f4)
if [[ "$device" =~ ^nvme ]]; then
if grep -q "none" "$dev"; then
echo "none" > "$dev" 2>/dev/null || true
fi
elif [[ "$device" =~ ^sd ]]; then
rotational_file="/sys/block/$device/queue/rotational"
if [[ -r "$rotational_file" ]]; then
if [[ "$(cat $rotational_file)" == "0" ]]; then
# SSD
if grep -q "mq-deadline" "$dev"; then
echo "mq-deadline" > "$dev" 2>/dev/null || true
fi
else
# HDD
if grep -q "bfq" "$dev"; then
echo "bfq" > "$dev" 2>/dev/null || true
fi
fi
fi
fi
fi
done
# Reload udev rules
udevadm control --reload-rules
udevadm trigger
echo -e "${GREEN}I/O schedulers configured${NC}"
}
# Configure kernel parameters
configure_kernel() {
echo -e "${YELLOW}[6/8] Configuring kernel I/O parameters...${NC}"
# Set parameters based on workload type
case "$WORKLOAD_TYPE" in
database)
DIRTY_BG_RATIO=5
DIRTY_RATIO=10
SWAPPINESS=1
READ_AHEAD=128
;;
storage)
DIRTY_BG_RATIO=15
DIRTY_RATIO=30
SWAPPINESS=10
READ_AHEAD=4096
;;
web)
DIRTY_BG_RATIO=10
DIRTY_RATIO=20
SWAPPINESS=5
READ_AHEAD=512
;;
*)
DIRTY_BG_RATIO=5
DIRTY_RATIO=15
SWAPPINESS=1
READ_AHEAD=1024
;;
esac
cat > /etc/sysctl.d/99-io-performance.conf << EOF
# I/O Performance optimizations for $WORKLOAD_TYPE workload
vm.dirty_background_ratio = $DIRTY_BG_RATIO
vm.dirty_ratio = $DIRTY_RATIO
vm.dirty_writeback_centisecs = 100
vm.dirty_expire_centisecs = 200
vm.swappiness = $SWAPPINESS
vm.vfs_cache_pressure = 50
vm.max_map_count = 262144
# Network I/O optimization
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
# CPU scheduler optimization (kernel.sched_migration_cost_ns was removed
# from sysctl in kernel 5.13+; -e below skips it on newer kernels)
kernel.sched_migration_cost_ns = 5000000
EOF
# Apply immediately; -e ignores keys the running kernel does not support
sysctl -e -p /etc/sysctl.d/99-io-performance.conf
echo -e "${GREEN}Kernel parameters configured${NC}"
}
# Optimize queue depths and read-ahead
optimize_queues() {
echo -e "${YELLOW}[7/8] Optimizing I/O queue depths and read-ahead...${NC}"
for dev in /sys/block/*/queue/nr_requests; do
if [[ -w "$dev" ]]; then
device=$(echo "$dev" | cut -d'/' -f4)
# Set queue depth based on device type
if [[ "$device" =~ ^nvme ]]; then
echo "1024" > "$dev" 2>/dev/null || true
else
echo "512" > "$dev" 2>/dev/null || true
fi
fi
done
# Set read-ahead based on workload
for dev in /sys/block/*/queue/read_ahead_kb; do
if [[ -w "$dev" ]]; then
echo "$READ_AHEAD" > "$dev" 2>/dev/null || true
fi
done
echo -e "${GREEN}Queue optimization completed${NC}"
}
# Verification
verify_setup() {
echo -e "${YELLOW}[8/8] Verifying configuration...${NC}"
echo "Current I/O scheduler settings:"
for dev in /sys/block/*/queue/scheduler; do
if [[ -r "$dev" ]]; then
device=$(echo "$dev" | cut -d'/' -f4)
scheduler=$(cat "$dev" | grep -o '\[.*\]' | tr -d '[]')
echo " $device: $scheduler"
fi
done
echo ""
echo "Applied kernel parameters:"
sysctl vm.dirty_background_ratio vm.dirty_ratio vm.swappiness 2>/dev/null || true
echo ""
echo -e "${GREEN}I/O optimization completed successfully!${NC}"
echo -e "${GREEN}Workload type: $WORKLOAD_TYPE${NC}"
echo -e "${GREEN}Backup location: $BACKUP_DIR${NC}"
echo ""
echo "To monitor I/O performance, use:"
echo " iostat -x 1"
echo " iotop"
echo " fio (for benchmarking)"
}
# Main execution
main() {
echo -e "${GREEN}Starting Linux I/O Performance Optimization${NC}"
echo -e "${GREEN}Workload type: $WORKLOAD_TYPE${NC}"
echo ""
check_prerequisites
install_tools
backup_config
analyze_io
configure_schedulers
configure_kernel
optimize_queues
verify_setup
}
main "$@"
Review the script before running. Execute as root with: sudo bash install.sh