Optimize Linux I/O performance with kernel tuning and storage schedulers for high-throughput workloads

Intermediate 25 min Apr 03, 2026
Ubuntu 24.04 Ubuntu 22.04 Debian 12 AlmaLinux 9 Rocky Linux 9 Fedora 41

Learn how to optimize Linux I/O performance through kernel parameter tuning, storage scheduler configuration, and filesystem optimizations. This tutorial covers scheduler selection, queue depth tuning, and performance monitoring for high-throughput applications.

Prerequisites

  • Root access to the Linux server
  • Basic understanding of Linux command line
  • Storage devices to optimize (SSD, NVMe, or HDD)

What this solves

Poor I/O performance can severely impact database servers, web applications, and data processing workloads. Linux I/O schedulers and kernel parameters aren't optimized for all workload types by default, leading to bottlenecks in high-throughput scenarios.

This tutorial shows you how to identify I/O bottlenecks, select appropriate schedulers for different storage types, tune kernel parameters, and optimize filesystem mount options for maximum throughput and minimum latency.

Step-by-step configuration

Install monitoring and benchmarking tools

Install essential tools for monitoring I/O performance and benchmarking storage devices.

# Debian/Ubuntu
sudo apt update
sudo apt install -y sysstat iotop fio hdparm nvme-cli

# AlmaLinux/Rocky/Fedora
sudo dnf install -y sysstat iotop fio hdparm nvme-cli

Analyze current I/O performance

Check current I/O scheduler settings and baseline performance before making changes.

# Check current schedulers for all block devices
for dev in /sys/block/*/queue/scheduler; do
  echo "$dev: $(cat $dev)"
done

Show current I/O statistics

iostat -x 1 3

Check NVMe device information (if applicable)

sudo nvme list

Configure I/O schedulers for different storage types

Set optimal schedulers based on storage technology. NVMe SSDs benefit from none or mq-deadline, while HDDs work better with bfq or mq-deadline.

# Check storage type and set appropriate scheduler
lsblk -d -o name,rota

For NVMe/SSD (rota=0) - use none scheduler

echo none | sudo tee /sys/block/nvme0n1/queue/scheduler

For HDDs (rota=1) - use bfq scheduler

echo bfq | sudo tee /sys/block/sda/queue/scheduler

For SATA SSDs - use mq-deadline

echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler
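The per-device choices above can be applied in one pass. Below is a minimal sketch, assuming the rota-to-scheduler rule described in this step (none for non-rotational devices, bfq for rotational); the tee line is commented out so the loop is a dry run by default:

```shell
# Recommend a scheduler from the queue/rotational flag:
# 0 (SSD/NVMe) -> none, 1 (HDD) -> bfq.
pick_scheduler() {
  if [ "$1" = "0" ]; then
    echo "none"
  else
    echo "bfq"
  fi
}

for rotfile in /sys/block/*/queue/rotational; do
  [ -e "$rotfile" ] || continue
  dev=$(echo "$rotfile" | cut -d/ -f4)
  sched=$(pick_scheduler "$(cat "$rotfile")")
  echo "$dev: would set scheduler to $sched"
  # Uncomment to apply (requires root):
  # echo "$sched" | sudo tee "/sys/block/$dev/queue/scheduler" > /dev/null
done
```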

Make scheduler changes persistent

Create udev rules to automatically apply scheduler settings on boot based on device type.

# /etc/udev/rules.d/60-io-schedulers.rules
# Set scheduler for NVMe devices
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"

Set scheduler for SSDs

ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="mq-deadline"

Set scheduler for HDDs

ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="bfq"
# Apply udev rules
sudo udevadm control --reload-rules
sudo udevadm trigger

Optimize I/O queue depths

Adjust queue depths to match storage capabilities and workload requirements. Higher queue depths improve throughput but may increase latency.

# Check current queue depth
cat /sys/block/nvme0n1/queue/nr_requests

Increase queue depth for high-throughput NVMe (the kernel rejects values the device or scheduler cannot support; with the none scheduler this is capped at the hardware queue depth)

echo 1024 | sudo tee /sys/block/nvme0n1/queue/nr_requests

Set read-ahead for sequential workloads (in KB)

echo 4096 | sudo tee /sys/block/nvme0n1/queue/read_ahead_kb

For databases, reduce read-ahead to minimize memory usage

echo 128 | sudo tee /sys/block/nvme0n1/queue/read_ahead_kb
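Note the unit mismatch when cross-checking these values: sysfs reports read-ahead in KB, while blockdev --getra/--setra work in 512-byte sectors. A small sketch to list current values (the kb_to_sectors helper is illustrative, not a standard tool):

```shell
# sysfs read_ahead_kb is in KB; `blockdev --getra` reports 512-byte
# sectors, so the sector count is twice the KB value.
kb_to_sectors() { echo $(( $1 * 2 )); }

for f in /sys/block/*/queue/read_ahead_kb; do
  [ -e "$f" ] || continue
  dev=$(echo "$f" | cut -d/ -f4)
  kb=$(cat "$f")
  echo "$dev: ${kb} KB read-ahead ($(kb_to_sectors "$kb") sectors)"
done
```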

Configure kernel I/O parameters

Optimize kernel parameters for better I/O performance, including dirty page handling and CPU scaling.

# Reduce dirty page writeback for consistent performance
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_writeback_centisecs = 100
vm.dirty_expire_centisecs = 200

Optimize for I/O intensive workloads

vm.swappiness = 1
vm.vfs_cache_pressure = 50

Increase maximum number of memory map areas

vm.max_map_count = 262144

TCP buffer tuning for network I/O

net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728

CPU scheduler for I/O bound processes (this sysctl was moved to /sys/kernel/debug/sched/migration_cost_ns in kernel 5.13; omit it on newer kernels)

kernel.sched_migration_cost_ns = 5000000
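Collected into a file, the settings above look like this, written to the path that the sysctl -p command in this step reads:

```shell
# Persist the kernel parameters in a sysctl drop-in file.
sudo tee /etc/sysctl.d/99-io-performance.conf > /dev/null << 'EOF'
# Dirty page writeback
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_writeback_centisecs = 100
vm.dirty_expire_centisecs = 200

# I/O intensive workloads
vm.swappiness = 1
vm.vfs_cache_pressure = 50
vm.max_map_count = 262144

# TCP buffers for network I/O
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728

# CPU scheduler (sysctl available on kernels before 5.13 only)
kernel.sched_migration_cost_ns = 5000000
EOF
```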
# Apply kernel parameters
sudo sysctl -p /etc/sysctl.d/99-io-performance.conf
Note: The dirty page settings above optimize for consistent write performance. Lower values cause more frequent but smaller writes, reducing latency spikes.

Optimize filesystem mount options

Configure mount options for better I/O performance based on filesystem type and use case.

# High-performance ext4 options for databases
/dev/nvme0n1p1 /var/lib/mysql ext4 defaults,noatime,nobarrier,data=writeback 0 2

Balanced ext4 options for general use

/dev/nvme0n1p2 /opt/data ext4 defaults,noatime,commit=30 0 2

XFS options for large files and high throughput

/dev/nvme0n1p3 /var/lib/backups xfs defaults,noatime,logbsize=256k,largeio 0 2
# Test mount options before rebooting
sudo mount -o remount,noatime /var/lib/mysql

Verify mount options

mount | grep nvme
Warning: The nobarrier option improves performance but may risk data integrity during power failures. Only use with UPS protection or for non-critical data.
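Mount options can also be checked programmatically with findmnt. The has_option helper below and the mount point it checks are illustrative:

```shell
# Succeed if mount point $1 is mounted with option $2.
has_option() {
  findmnt -no OPTIONS "$1" 2>/dev/null | tr ',' '\n' | grep -qx "$2"
}

# Example: confirm noatime took effect after the remount above.
if has_option /var/lib/mysql noatime; then
  echo "/var/lib/mysql mounted with noatime"
else
  echo "noatime not active on /var/lib/mysql"
fi
```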

Configure per-process I/O scheduling

Set I/O priority classes for different applications to prevent I/O interference.

# Set real-time I/O priority for database
sudo ionice -c 1 -n 4 -p $(pgrep mysqld)

Set idle priority for backup processes

sudo ionice -c 3 -p $(pgrep backup)

Check I/O priorities

for pid in $(pgrep -f mysql); do
  echo "PID $pid: $(ionice -p $pid)"
done

Create I/O performance monitoring script

Set up continuous monitoring to track I/O performance improvements and identify bottlenecks.

#!/bin/bash
# Save as /usr/local/bin/io-monitor.sh

LOGFILE="/var/log/io-performance.log"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')

echo "=== I/O Performance Report - $TIMESTAMP ===" >> $LOGFILE

# Device statistics
echo "Device Statistics:" >> $LOGFILE
iostat -x 1 1 | grep -E '(Device|nvme|sd[a-z])' >> $LOGFILE

# Top I/O processes (iotop needs -b for non-interactive batch mode)
echo "Top I/O Processes:" >> $LOGFILE
iotop -b -a -o -d 1 -n 1 | head -20 >> $LOGFILE

# Queue depths and schedulers
echo "Scheduler Configuration:" >> $LOGFILE
for dev in /sys/block/*/queue/scheduler; do
  device=$(echo $dev | cut -d'/' -f4)
  scheduler=$(cat $dev | grep -o '\[.*\]' | tr -d '[]')
  queue_depth=$(cat /sys/block/$device/queue/nr_requests)
  echo "$device: scheduler=$scheduler, queue_depth=$queue_depth" >> $LOGFILE
done
echo "" >> $LOGFILE

sudo chmod 755 /usr/local/bin/io-monitor.sh

Create systemd timer for regular monitoring

sudo tee /etc/systemd/system/io-monitor.timer > /dev/null << 'EOF'
[Unit]
Description=I/O Performance Monitor Timer

[Timer]
OnCalendar=*:0/10
Persistent=true

[Install]
WantedBy=timers.target
EOF

sudo tee /etc/systemd/system/io-monitor.service > /dev/null << 'EOF'
[Unit]
Description=I/O Performance Monitor

[Service]
Type=oneshot
ExecStart=/usr/local/bin/io-monitor.sh
EOF

Enable monitoring timer

sudo systemctl daemon-reload
sudo systemctl enable --now io-monitor.timer

Benchmark I/O improvements

Run comprehensive I/O benchmarks

Use fio to test different I/O patterns and measure performance improvements.

# Random read performance (database-like workload)
sudo fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting --filename=/dev/nvme0n1

Sequential write performance (backup/logging workload)

sudo fio --name=seqwrite --ioengine=libaio --iodepth=32 --rw=write --bs=64k --direct=1 --size=2G --numjobs=2 --runtime=60 --group_reporting --filename=/dev/nvme0n1

Mixed workload test

sudo fio --name=mixed --ioengine=libaio --iodepth=16 --rw=randrw --rwmixread=70 --bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting --filename=/dev/nvme0n1
Warning: The sequential write and mixed tests write directly to the block device and will destroy its data; the random read test only reads. Run write tests against a scratch device or test partition, and ensure you have backups.
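A non-destructive alternative is to point fio at a scratch file on a mounted filesystem instead of the raw device; the TESTFILE path below is just an example:

```shell
# Run the random-read benchmark against a scratch file, not the raw device.
TESTFILE="${TESTFILE:-/var/tmp/fio-scratch}"
if command -v fio >/dev/null 2>&1; then
  fio --name=randread --ioengine=libaio --iodepth=16 --rw=randread \
      --bs=4k --direct=1 --size=256M --runtime=30 --group_reporting \
      --filename="$TESTFILE" \
    || echo "fio run failed (filesystem may not support O_DIRECT)"
else
  echo "fio not installed; skipping"
fi
rm -f "$TESTFILE"
```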

Verify your setup

# Check active I/O schedulers
for dev in /sys/block/*/queue/scheduler; do
  echo "$dev: $(cat $dev)"
done

Verify kernel parameters

sudo sysctl -a | grep -E 'vm.dirty|vm.swappiness'

Check I/O statistics

iostat -x 1 3

Monitor top I/O processes

iotop -a -o -d 2

Check monitoring timer status

sudo systemctl status io-monitor.timer

Common issues

Symptom | Cause | Fix
High I/O wait times | Wrong scheduler for storage type | Switch to appropriate scheduler (none for NVMe, bfq for HDD)
Inconsistent write performance | Large dirty page ratio | Reduce vm.dirty_ratio to 10 or lower
Scheduler changes don't persist | Missing udev rules | Create /etc/udev/rules.d/60-io-schedulers.rules
Database slowdowns during backups | I/O priority conflicts | Set backup processes to idle priority with ionice -c 3
Low throughput on NVMe | Insufficient queue depth | Increase nr_requests to 1024 or higher
High memory usage | Excessive read-ahead buffering | Reduce read_ahead_kb for random access workloads

