Configure Elasticsearch 8 for maximum indexing performance when handling large datasets through bulk API optimization, JVM memory tuning, and index mapping strategies. This guide covers production-grade performance tuning for high-throughput indexing workloads.
Prerequisites
- Elasticsearch 8 installed and running
- At least 16GB RAM available
- Root or sudo access
- Python 3.6+ installed
What this solves
When indexing large datasets into Elasticsearch, default configurations often result in poor performance, high memory usage, and indexing timeouts. This tutorial shows you how to optimize Elasticsearch 8 for high-throughput bulk indexing operations through JVM heap tuning, bulk API configuration, index settings optimization, and OS-level performance improvements.
You'll learn to handle datasets with millions of documents efficiently while maintaining cluster stability and search performance. These optimizations are essential for log aggregation systems, data lakes, and real-time analytics platforms that require fast data ingestion.
Prerequisites and system requirements
This tutorial assumes you have Elasticsearch 8 already installed and running. If you need to install Elasticsearch first, follow our Elasticsearch 8 installation guide.
Your system should have at least 16GB RAM for optimal performance with the configurations shown here. For production environments processing large datasets, 32GB or more is recommended.
Step-by-step performance optimization
Configure JVM heap size for optimal memory usage
Set the JVM heap to 50% of available RAM, but keep it below roughly 31GB so the JVM can still use compressed object pointers; crossing that threshold makes every object pointer larger and can actually reduce usable heap. The remaining memory is left to the OS file system cache, which Elasticsearch relies on heavily for performance. In Elasticsearch 8, place these flags in a file under /etc/elasticsearch/jvm.options.d/ rather than editing jvm.options directly.
# Set initial and maximum heap size
-Xms16g
-Xmx16g
Enable G1GC for better large heap performance
-XX:+UseG1GC
-XX:G1HeapRegionSize=32m
-XX:+G1UseAdaptiveIHOP
-XX:G1MixedGCCountTarget=8
-XX:G1HeapWastePercent=5
Optimize GC logging for monitoring
-Xlog:gc*,gc+age=trace,safepoint:gc.log:time,level,tags
-XX:+UnlockDiagnosticVMOptions
-XX:+LogVMOutput
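The heap-sizing rule above (half of RAM, capped below the compressed-oops cutoff) can be expressed as a small helper. This is an illustrative sketch: `recommended_heap_gb` is a hypothetical function name, and 31GB is a conservative cap chosen to stay under the ~32GB compressed-oops threshold.

```python
def recommended_heap_gb(total_ram_gb: int) -> int:
    """Half of physical RAM, capped below the ~32 GB compressed-oops cutoff."""
    COMPRESSED_OOPS_SAFE_GB = 31  # stay under the threshold with some margin
    return min(total_ram_gb // 2, COMPRESSED_OOPS_SAFE_GB)

# A 32 GB host gets -Xms16g/-Xmx16g; a 128 GB host is capped at 31 GB
for ram in (16, 32, 64, 128):
    print(f"{ram} GB RAM -> -Xms{recommended_heap_gb(ram)}g -Xmx{recommended_heap_gb(ram)}g")
```

The 16g values used in this guide correspond to a 32GB host under this rule.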
Optimize Elasticsearch cluster settings for bulk operations
Configure settings to handle high indexing loads efficiently. In Elasticsearch 8 the write thread pool and indexing buffer sizes are static node settings, so they cannot be changed through the cluster settings API; add them to /etc/elasticsearch/elasticsearch.yml and restart the node:

thread_pool.write.queue_size: 1000
thread_pool.write.size: 8
indices.memory.index_buffer_size: 20%
indices.memory.min_index_buffer_size: 96mb

The disk watermarks are dynamic and can be applied at runtime through the cluster settings API:

curl -X PUT "localhost:9200/_cluster/settings" -H "Content-Type: application/json" -d '
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "85%",
    "cluster.routing.allocation.disk.watermark.high": "90%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "95%"
  }
}'
Create optimized index templates for bulk indexing
Set up index templates with settings optimized for high-throughput indexing. These settings reduce replica overhead during bulk operations and optimize segment merging.
curl -X PUT "localhost:9200/_index_template/bulk_optimized_template" -H "Content-Type: application/json" -d '
{
  "index_patterns": ["logs-*", "metrics-*", "bulk-*"],
  "template": {
    "settings": {
      "number_of_shards": 2,
      "number_of_replicas": 0,
      "refresh_interval": "30s",
      "index.translog.flush_threshold_size": "1gb",
      "index.translog.sync_interval": "30s",
      "index.merge.policy.max_merge_at_once": 5,
      "index.merge.policy.segments_per_tier": 5,
      "index.merge.scheduler.max_thread_count": 2,
      "index.codec": "best_compression",
      "index.mapping.total_fields.limit": 10000
    },
    "mappings": {
      "dynamic_templates": [
        {
          "strings_as_keywords": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        {
          "timestamps": {
            "match": "*timestamp*",
            "mapping": {
              "type": "date",
              "format": "strict_date_optional_time||epoch_millis"
            }
          }
        }
      ]
    }
  }
}'
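To sanity-check which indices this template will apply to, the wildcard patterns can be tested locally with Python's `fnmatch` (an illustration only; `template_applies` is a hypothetical helper, and real template selection in Elasticsearch also depends on template priority):

```python
from fnmatch import fnmatch

# Same patterns as the bulk_optimized_template above
patterns = ["logs-*", "metrics-*", "bulk-*"]

def template_applies(index_name):
    """True if any of the template's wildcard patterns matches the index name."""
    return any(fnmatch(index_name, p) for p in patterns)

print(template_applies("logs-2024.06.01"))  # True: matches logs-*
print(template_applies("orders-2024"))      # False: no pattern matches
```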
Configure OS-level performance optimizations
Optimize kernel parameters for Elasticsearch performance. Add the following settings to /etc/sysctl.d/99-elasticsearch.conf; they improve I/O performance and memory management.
# Virtual memory settings
vm.max_map_count=262144
vm.swappiness=1
vm.dirty_ratio=15
vm.dirty_background_ratio=5
# Network settings
net.core.somaxconn=32768
net.core.netdev_max_backlog=5000
net.core.rmem_default=262144
net.core.rmem_max=16777216
net.core.wmem_default=262144
net.core.wmem_max=16777216
# File system settings
fs.file-max=1000000
Apply the kernel parameter changes:
sudo sysctl -p /etc/sysctl.d/99-elasticsearch.conf
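Verifying that the values actually took effect can be scripted by parsing the config file and comparing it against live values (for example, read from /proc/sys). A minimal sketch; `parse_sysctl` and `mismatches` are hypothetical helpers, and the live values are passed in explicitly so the comparison logic stays testable anywhere:

```python
def parse_sysctl(text):
    """Parse key=value lines from a sysctl config, skipping comments and blanks."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        settings[key.strip()] = value.strip()
    return settings

def mismatches(expected, live):
    """Return expected settings whose live value differs or is missing."""
    return {k: v for k, v in expected.items() if live.get(k) != v}

config = """
# Elasticsearch performance optimizations
vm.max_map_count=262144
vm.swappiness=1
"""
expected = parse_sysctl(config)
# Simulated live values: max_map_count still at the common default
print(mismatches(expected, {"vm.max_map_count": "65530", "vm.swappiness": "1"}))
# -> {'vm.max_map_count': '262144'}
```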
Set file descriptor limits for Elasticsearch user
Increase file descriptor limits to handle large numbers of concurrent connections and open files during bulk operations. Add these lines to /etc/security/limits.conf (or a file under /etc/security/limits.d/):
elasticsearch soft nofile 1000000
elasticsearch hard nofile 1000000
elasticsearch soft nproc 32768
elasticsearch hard nproc 32768
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
Configure systemd service limits
Update the Elasticsearch systemd service to use the increased limits and prevent memory swapping. Create an override with sudo systemctl edit elasticsearch and add:
[Service]
LimitNOFILE=1000000
LimitNPROC=32768
LimitMEMLOCK=infinity
TimeoutStartSec=180
Reload systemd and restart Elasticsearch:
sudo systemctl daemon-reload
sudo systemctl restart elasticsearch
sudo systemctl status elasticsearch
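With LimitMEMLOCK in place, it is worth confirming that memory locking actually took effect via GET _nodes/process, which reports an mlockall flag per node. The check over the parsed response can be written as a pure function (a sketch; `all_nodes_mlockall` is a hypothetical helper, and the sample dict mirrors the standard nodes-info response shape):

```python
def all_nodes_mlockall(nodes_info):
    """True if every node in a GET _nodes/process response reports mlockall."""
    nodes = nodes_info.get("nodes", {})
    return bool(nodes) and all(
        n.get("process", {}).get("mlockall", False) for n in nodes.values()
    )

sample = {"nodes": {"abc123": {"process": {"mlockall": True}}}}
print(all_nodes_mlockall(sample))  # True
```

Note that mlockall also requires bootstrap.memory_lock: true in elasticsearch.yml, not just the systemd limit.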
Create bulk indexing script with optimal batch sizes
Create a Python script that demonstrates optimal bulk indexing techniques with proper error handling and performance monitoring.
#!/usr/bin/env python3
import json
import time
import requests
from datetime import datetime, timezone
import threading
from queue import Queue


class BulkIndexer:
    def __init__(self, es_host="localhost:9200", batch_size=5000, workers=4):
        self.es_host = es_host
        self.batch_size = batch_size
        self.workers = workers
        self.queue = Queue(maxsize=workers * 2)
        self.stats = {
            'indexed': 0,
            'errors': 0,
            'start_time': time.time()
        }

    def bulk_index_worker(self):
        """Worker thread for bulk indexing"""
        session = requests.Session()
        while True:
            batch = self.queue.get()
            if batch is None:
                break
            try:
                response = session.post(
                    f"http://{self.es_host}/_bulk",
                    data=batch,
                    headers={'Content-Type': 'application/x-ndjson'},
                    timeout=300
                )
                if response.status_code == 200:
                    result = response.json()
                    self.stats['indexed'] += len(result.get('items', []))
                    # Check for indexing errors
                    for item in result.get('items', []):
                        if 'error' in item.get('index', {}):
                            self.stats['errors'] += 1
                            print(f"Index error: {item['index']['error']}")
                else:
                    self.stats['errors'] += 1
                    print(f"HTTP Error: {response.status_code} - {response.text}")
            except Exception as e:
                self.stats['errors'] += 1
                print(f"Request error: {e}")
            finally:
                self.queue.task_done()

    def index_documents(self, documents, index_name):
        """Index documents using bulk API with optimal batching"""
        # Start worker threads
        threads = []
        for _ in range(self.workers):
            t = threading.Thread(target=self.bulk_index_worker)
            t.daemon = True
            t.start()
            threads.append(t)
        # Process documents in batches
        batch = []
        for doc in documents:
            # Add index action
            action = {"index": {"_index": index_name}}
            batch.append(json.dumps(action))
            batch.append(json.dumps(doc))
            if len(batch) >= self.batch_size * 2:  # * 2 because each doc has action + data
                self.queue.put('\n'.join(batch) + '\n')
                batch = []
        # Process remaining documents
        if batch:
            self.queue.put('\n'.join(batch) + '\n')
        # Wait for all tasks to complete
        self.queue.join()
        # Stop workers
        for _ in range(self.workers):
            self.queue.put(None)
        for t in threads:
            t.join()
        # Print statistics
        elapsed = time.time() - self.stats['start_time']
        rate = self.stats['indexed'] / elapsed if elapsed > 0 else 0
        print("Indexing complete:")
        print(f"  Documents: {self.stats['indexed']}")
        print(f"  Errors: {self.stats['errors']}")
        print(f"  Time: {elapsed:.2f}s")
        print(f"  Rate: {rate:.2f} docs/sec")


# Example usage
if __name__ == "__main__":
    # Generate sample documents
    def generate_sample_docs(count=100000):
        for i in range(count):
            yield {
                "@timestamp": datetime.now(timezone.utc).isoformat(),
                "message": f"Sample log message {i}",
                "level": "INFO" if i % 10 != 0 else "ERROR",
                "user_id": i % 1000,
                "request_id": f"req_{i}",
                "response_time": (i % 100) * 10
            }

    indexer = BulkIndexer(batch_size=5000, workers=4)
    docs = generate_sample_docs(100000)
    indexer.index_documents(docs, "bulk-test-index")
Make the script executable:
sudo chmod 755 /opt/elasticsearch/bulk_indexer.py
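The script above batches by document count; Elastic's guidance is usually framed in payload bytes (roughly 5-15MB per bulk request). A count-based batcher can be complemented with a byte-size cap. This is a sketch, not part of the script above: `chunk_by_bytes` is a hypothetical helper, and the 10MB default is an example value.

```python
import json

def chunk_by_bytes(documents, index_name, max_bytes=10 * 1024 * 1024):
    """Yield NDJSON bulk bodies, starting a new batch before max_bytes is exceeded."""
    lines, size = [], 0
    for doc in documents:
        action = json.dumps({"index": {"_index": index_name}})
        source = json.dumps(doc)
        entry_size = len(action) + len(source) + 2  # +2 for the two newlines
        if lines and size + entry_size > max_bytes:
            yield "\n".join(lines) + "\n"
            lines, size = [], 0
        lines.append(action)
        lines.append(source)
        size += entry_size
    if lines:
        yield "\n".join(lines) + "\n"

docs = [{"message": "x" * 100, "n": i} for i in range(1000)]
batches = list(chunk_by_bytes(docs, "bulk-test-index", max_bytes=20_000))
print(len(batches), "batches")
```

Each yielded body can be POSTed to _bulk exactly like the batches in the worker above.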
Install Python dependencies for bulk indexing
Install the required Python packages for the bulk indexing script.
sudo apt update
sudo apt install -y python3-pip python3-venv
python3 -m venv /opt/elasticsearch/venv
source /opt/elasticsearch/venv/bin/activate
pip install requests
Configure monitoring for indexing performance
Set up monitoring to track indexing performance and identify bottlenecks. These API calls provide real-time metrics during bulk operations.
#!/bin/bash
echo "=== Elasticsearch Indexing Performance Monitor ==="
echo "Press Ctrl+C to stop monitoring"
echo
while true; do
echo "--- $(date) ---"
# Indexing stats
curl -s "localhost:9200/_stats/indexing" | jq -r '
.indices | to_entries[] |
"\(.key): indexed=\(.value.total.indexing.index_total) time=\(.value.total.indexing.index_time_in_millis)ms"
' | head -5
echo
# Thread pool stats
curl -s "localhost:9200/_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected,completed"
echo
# JVM stats
curl -s "localhost:9200/_nodes/stats/jvm" | jq -r '
.nodes[] |
"JVM: heap_used=\(.jvm.mem.heap_used_percent)% gc_time=\(.jvm.gc.collectors.old.collection_time_in_millis)ms"
'
echo "----------------------------------------"
sleep 10
done
sudo chmod 755 /opt/elasticsearch/monitor_indexing.sh
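The _cat/thread_pool output used in the monitor is column-aligned text; a small parser makes it easy to alert on rejections programmatically instead of eyeballing the terminal. A sketch assuming the header columns requested above (`parse_cat_threadpool` is a hypothetical helper):

```python
def parse_cat_threadpool(cat_output):
    """Parse _cat/thread_pool output (with a ?v header row) into dicts."""
    lines = [l for l in cat_output.strip().splitlines() if l.strip()]
    header = lines[0].split()
    return [dict(zip(header, row.split())) for row in lines[1:]]

sample = """node_name name  active queue rejected completed
node-1    write 4      12    0        981245
node-2    write 8      1000  37       765432
"""
rows = parse_cat_threadpool(sample)
rejected = [r["node_name"] for r in rows if int(r["rejected"]) > 0]
print(rejected)  # -> ['node-2']
```

A nonzero rejected count is the signal to shrink batch sizes or worker counts on the client side.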
Optimize index settings for specific use cases
Configure time-based indices for log data
For time-series data like logs, use daily or hourly indices with Index Lifecycle Management (ILM) to optimize performance and storage.
curl -X PUT "localhost:9200/_ilm/policy/logs_policy" -H "Content-Type: application/json" -d '
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_size": "10GB",
"max_age": "1d"
},
"set_priority": {
"priority": 100
}
}
},
"warm": {
"min_age": "1d",
"actions": {
"set_priority": {
"priority": 50
},
"allocate": {
"number_of_replicas": 1
},
"forcemerge": {
"max_num_segments": 1
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}'
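The hot-phase rollover above fires when ANY configured condition is met: the index exceeds max_size or outlives max_age, whichever comes first. That OR semantics can be sketched as a pure function (`should_rollover` is a hypothetical illustration; threshold defaults mirror the policy above):

```python
def should_rollover(size_gb, age_days, max_size_gb=10, max_age_days=1):
    """ILM rollover fires when ANY configured max_* condition is met."""
    return size_gb >= max_size_gb or age_days >= max_age_days

print(should_rollover(size_gb=12, age_days=0.2))  # True: size threshold hit first
print(should_rollover(size_gb=3, age_days=0.5))   # False: neither condition met
```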
Create data stream template for continuous indexing
Set up a data stream template that automatically applies the ILM policy and optimized settings for continuous data ingestion.
curl -X PUT "localhost:9200/_index_template/logs_stream_template" -H "Content-Type: application/json" -d '
{
"index_patterns": ["logs-app-*"],
"data_stream": {},
"template": {
"settings": {
"index.lifecycle.name": "logs_policy",
"index.lifecycle.rollover_alias": "logs-app",
"number_of_shards": 2,
"number_of_replicas": 0,
"refresh_interval": "30s",
"index.codec": "best_compression"
},
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"message": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"level": {
"type": "keyword"
}
}
}
},
"priority": 200
}'
Verify your setup
Test the optimized configuration by running the bulk indexing script and monitoring performance:
# Test the bulk indexer
source /opt/elasticsearch/venv/bin/activate
python3 /opt/elasticsearch/bulk_indexer.py

# Monitor indexing in another terminal
/opt/elasticsearch/monitor_indexing.sh

# Check cluster health
curl -s "localhost:9200/_cluster/health" | jq .

# Verify index statistics
curl -s "localhost:9200/bulk-test-index/_stats" | jq '.indices[].total.indexing'

# Check JVM memory usage
curl -s "localhost:9200/_nodes/stats/jvm" | jq '.nodes[].jvm.mem'
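When scripting these checks, the _cluster/health response can be reduced to a pass/fail plus a short note (a sketch over the standard health fields; `health_summary` and its thresholds are illustrative choices, e.g. treating yellow as acceptable during a replicas=0 bulk load):

```python
def health_summary(health):
    """Summarize a _cluster/health response for bulk-indexing checks."""
    status = health.get("status", "unknown")
    relocating = health.get("relocating_shards", 0)
    pending = health.get("number_of_pending_tasks", 0)
    # Yellow is tolerable mid-load (unassigned replicas); red is not
    ok = status in ("green", "yellow") and pending < 100
    return {"ok": ok, "status": status,
            "note": f"relocating={relocating} pending_tasks={pending}"}

sample = {"status": "yellow", "relocating_shards": 0,
          "number_of_pending_tasks": 2}
print(health_summary(sample))
```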
Performance tuning recommendations
| Scenario | Batch Size | Workers | Refresh Interval | Replicas |
|---|---|---|---|---|
| Initial data load | 5000-10000 | 4-8 | -1 (disable) | 0 |
| Real-time ingestion | 1000-2000 | 2-4 | 30s-60s | 1 |
| Log aggregation | 3000-5000 | 4-6 | 30s | 0-1 |
| Time-series metrics | 2000-3000 | 3-5 | 30s | 1 |
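The table above can be encoded directly, which is handy when an ingestion job wants to pick settings by scenario. The values are transcribed from the table (taking the upper end of each range); treat them as starting points, not hard rules, and note `TUNING_PROFILES` and `index_settings` are hypothetical names:

```python
TUNING_PROFILES = {
    "initial_load":    {"batch_size": 10000, "workers": 8, "refresh_interval": "-1",  "replicas": 0},
    "realtime":        {"batch_size": 2000,  "workers": 4, "refresh_interval": "30s", "replicas": 1},
    "log_aggregation": {"batch_size": 5000,  "workers": 6, "refresh_interval": "30s", "replicas": 0},
    "timeseries":      {"batch_size": 3000,  "workers": 5, "refresh_interval": "30s", "replicas": 1},
}

def index_settings(profile):
    """Index-level settings body suitable for PUT <index>/_settings."""
    p = TUNING_PROFILES[profile]
    return {"index": {"refresh_interval": p["refresh_interval"],
                      "number_of_replicas": p["replicas"]}}

print(index_settings("initial_load"))
```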
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Bulk requests timing out | Batch size too large or insufficient heap | Reduce batch size to 1000-2000 docs, increase JVM heap |
| High memory usage during indexing | Index buffer size too high | Reduce indices.memory.index_buffer_size to 10%-15% |
| Thread pool rejections | Too many concurrent bulk requests | Increase thread_pool.write.queue_size or reduce workers |
| Slow indexing performance | Too many replicas during bulk load | Set replicas to 0 during indexing, increase after completion |
| OutOfMemoryError | JVM heap too small for workload | Increase heap size but keep under 32GB, monitor GC logs |
| Disk space issues | Translog and segment files accumulating | Reduce refresh_interval and translog.flush_threshold_size |
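For the thread pool rejections row in particular, the usual client-side complement is to retry 429 responses with capped exponential backoff rather than enlarging server queues indefinitely. A minimal backoff schedule (a sketch; the base, cap, and retry count are example values):

```python
def backoff_delays(retries, base=0.5, cap=30.0):
    """Exponential backoff delays in seconds: base * 2^attempt, capped."""
    return [min(base * (2 ** attempt), cap) for attempt in range(retries)]

def should_retry(status_code, attempt, max_retries=6):
    """Retry a bulk request only on HTTP 429 (thread pool rejection)."""
    return status_code == 429 and attempt < max_retries

print(backoff_delays(6))  # -> [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
```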
Next steps
- Configure Jaeger with Elasticsearch backend for application performance monitoring
- Set up Grafana dashboards for monitoring Elasticsearch performance metrics
- Implement Spark streaming with Kafka for real-time data processing before indexing
- Configure Elasticsearch cross-cluster replication for high availability
- Implement Elasticsearch snapshot lifecycle management for data backup and archival
Automated install script
Run this script to automate the entire setup:
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
# Usage function
usage() {
echo "Usage: $0 [OPTIONS]"
echo "Options:"
echo " -h, --help Show this help message"
echo " --heap-size SIZE Set JVM heap size (default: 16g)"
echo " --skip-os-tuning Skip OS-level optimizations"
exit 1
}
# Default values
HEAP_SIZE="16g"
SKIP_OS_TUNING=false
# Parse arguments
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
usage
;;
--heap-size)
HEAP_SIZE="$2"
shift 2
;;
--skip-os-tuning)
SKIP_OS_TUNING=true
shift
;;
*)
echo -e "${RED}Unknown option: $1${NC}"
usage
;;
esac
done
# Cleanup function for rollback on failure
cleanup() {
echo -e "${RED}Script failed! Rolling back changes...${NC}"
if [[ -f /etc/elasticsearch/jvm.options.backup ]]; then
mv /etc/elasticsearch/jvm.options.backup /etc/elasticsearch/jvm.options
fi
if [[ -f /etc/sysctl.conf.backup ]]; then
mv /etc/sysctl.conf.backup /etc/sysctl.conf
fi
exit 1
}
trap cleanup ERR
# Check if running as root
if [[ $EUID -ne 0 ]]; then
echo -e "${RED}This script must be run as root${NC}"
exit 1
fi
# Auto-detect distribution
if [[ -f /etc/os-release ]]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_INSTALL="apt install -y"
PKG_UPDATE="apt update"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_SERVICE="elasticsearch"
;;
almalinux|rocky|centos|rhel|ol|fedora)
PKG_MGR="dnf"
PKG_INSTALL="dnf install -y"
PKG_UPDATE="dnf update -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_SERVICE="elasticsearch"
;;
amzn)
PKG_MGR="yum"
PKG_INSTALL="yum install -y"
PKG_UPDATE="yum update -y"
ES_CONFIG_DIR="/etc/elasticsearch"
ES_SERVICE="elasticsearch"
;;
*)
echo -e "${RED}Unsupported distribution: $ID${NC}"
exit 1
;;
esac
else
echo -e "${RED}Cannot detect distribution${NC}"
exit 1
fi
echo -e "${GREEN}Starting Elasticsearch 8 performance optimization for $PRETTY_NAME${NC}"
# Check if Elasticsearch is installed
echo -e "${YELLOW}[1/6] Checking prerequisites...${NC}"
# Package installs don't put the elasticsearch binary on PATH, so check the config dir
if [[ ! -d "$ES_CONFIG_DIR" ]]; then
echo -e "${RED}Elasticsearch is not installed. Please install it first.${NC}"
exit 1
fi
if ! systemctl is-active --quiet $ES_SERVICE; then
echo -e "${RED}Elasticsearch service is not running. Starting it...${NC}"
systemctl start $ES_SERVICE
sleep 10
fi
# Install required tools
$PKG_INSTALL curl jq
# Configure JVM heap size and GC settings
echo -e "${YELLOW}[2/6] Configuring JVM heap size and garbage collection...${NC}"
if [[ -f "$ES_CONFIG_DIR/jvm.options" ]]; then
cp "$ES_CONFIG_DIR/jvm.options" "$ES_CONFIG_DIR/jvm.options.backup"
fi
cat > "$ES_CONFIG_DIR/jvm.options.d/performance.options" << EOF
# Heap size configuration
-Xms$HEAP_SIZE
-Xmx$HEAP_SIZE
# G1GC configuration for better large heap performance
-XX:+UseG1GC
-XX:G1HeapRegionSize=32m
-XX:+G1UseAdaptiveIHOP
-XX:G1MixedGCCountTarget=8
-XX:G1HeapWastePercent=5
# GC logging for monitoring
-Xlog:gc*,gc+age=trace,safepoint:gc.log:time,level,tags
-XX:+UnlockDiagnosticVMOptions
-XX:+LogVMOutput
EOF
chown root:elasticsearch "$ES_CONFIG_DIR/jvm.options.d/performance.options"
chmod 644 "$ES_CONFIG_DIR/jvm.options.d/performance.options"
# Configure OS-level performance optimizations
if [[ "$SKIP_OS_TUNING" == false ]]; then
echo -e "${YELLOW}[3/6] Applying OS-level performance optimizations...${NC}"
# Backup sysctl.conf
cp /etc/sysctl.conf /etc/sysctl.conf.backup
# Apply kernel parameters
cat >> /etc/sysctl.conf << EOF
# Elasticsearch performance optimizations
vm.max_map_count=262144
vm.swappiness=1
vm.dirty_ratio=15
vm.dirty_background_ratio=5
net.core.rmem_max=134217728
net.core.wmem_max=134217728
net.ipv4.tcp_rmem=4096 65536 134217728
net.ipv4.tcp_wmem=4096 65536 134217728
EOF
# Apply settings immediately
sysctl -p
fi
# Restart Elasticsearch to apply JVM changes
echo -e "${YELLOW}[4/6] Restarting Elasticsearch to apply JVM settings...${NC}"
systemctl restart $ES_SERVICE
sleep 15
# Wait for Elasticsearch to be ready
echo "Waiting for Elasticsearch to be ready..."
for i in {1..30}; do
if curl -s http://localhost:9200/_cluster/health >/dev/null 2>&1; then
break
fi
sleep 2
done
# Configure settings for bulk operations
echo -e "${YELLOW}[5/6] Optimizing settings for bulk operations...${NC}"
# Thread pool and index buffer sizes are static node settings in Elasticsearch 8,
# so they belong in elasticsearch.yml rather than the cluster settings API
cat >> "$ES_CONFIG_DIR/elasticsearch.yml" << EOF
# Bulk indexing optimizations
thread_pool.write.queue_size: 1000
thread_pool.write.size: 8
indices.memory.index_buffer_size: 20%
indices.memory.min_index_buffer_size: 96mb
EOF
systemctl restart $ES_SERVICE
sleep 15
# Disk watermarks are dynamic and can be applied through the cluster settings API
curl -X PUT "localhost:9200/_cluster/settings" -H "Content-Type: application/json" -d '{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "85%",
"cluster.routing.allocation.disk.watermark.high": "90%",
"cluster.routing.allocation.disk.watermark.flood_stage": "95%"
}
}' || echo -e "${YELLOW}Warning: Could not update cluster settings. Elasticsearch may still be starting.${NC}"
# Create optimized index template
echo -e "${YELLOW}[6/6] Creating optimized index template for bulk indexing...${NC}"
curl -X PUT "localhost:9200/_index_template/bulk_optimized_template" -H "Content-Type: application/json" -d '{
"index_patterns": ["logs-*", "metrics-*", "bulk-*"],
"template": {
"settings": {
"number_of_shards": 2,
"number_of_replicas": 0,
"refresh_interval": "30s",
"index.translog.flush_threshold_size": "1gb",
"index.translog.sync_interval": "30s",
"index.merge.policy.max_merge_at_once": 5,
"index.merge.policy.segments_per_tier": 5,
"index.merge.scheduler.max_thread_count": 2,
"index.codec": "best_compression",
"index.mapping.total_fields.limit": 10000
},
"mappings": {
"dynamic_templates": [
{
"strings_as_keywords": {
"match_mapping_type": "string",
"mapping": {
"type": "keyword",
"ignore_above": 256
}
}
},
{
"timestamps": {
"match": "*timestamp*",
"mapping": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
]
}
}
}' || echo -e "${YELLOW}Warning: Could not create index template. Elasticsearch may still be starting.${NC}"
# Verification checks
echo -e "${GREEN}Running verification checks...${NC}"
# Check if Elasticsearch is responding
if curl -s http://localhost:9200/_cluster/health | jq -e '.status' >/dev/null 2>&1; then
echo -e "${GREEN}✓ Elasticsearch is responding${NC}"
CLUSTER_STATUS=$(curl -s http://localhost:9200/_cluster/health | jq -r '.status')
echo -e "${GREEN}✓ Cluster status: $CLUSTER_STATUS${NC}"
else
echo -e "${RED}✗ Elasticsearch is not responding properly${NC}"
fi
# Check JVM heap size
HEAP_INFO=$(curl -s http://localhost:9200/_nodes/stats/jvm | jq -r '.nodes | to_entries[0].value.jvm.mem.heap_max_in_bytes' 2>/dev/null || echo "unknown")
if [[ "$HEAP_INFO" =~ ^[0-9]+$ ]]; then
HEAP_GB=$((HEAP_INFO / 1024 / 1024 / 1024))
echo -e "${GREEN}✓ JVM heap size configured: ${HEAP_GB}GB${NC}"
else
echo -e "${YELLOW}⚠ Could not verify JVM heap size${NC}"
fi
# Check if template was created
if curl -sf http://localhost:9200/_index_template/bulk_optimized_template >/dev/null 2>&1; then
echo -e "${GREEN}✓ Bulk optimized template created successfully${NC}"
else
echo -e "${YELLOW}⚠ Could not verify template creation${NC}"
fi
echo -e "${GREEN}Elasticsearch 8 performance optimization completed!${NC}"
echo ""
echo -e "${YELLOW}Next steps:${NC}"
echo "1. Monitor GC logs in Elasticsearch logs directory"
echo "2. Use bulk API with batch sizes of 5-15MB for optimal performance"
echo "3. Monitor cluster performance with: curl localhost:9200/_cluster/stats"
echo "4. Consider adding replicas after bulk indexing is complete"
Review the script before running. Execute with: bash install.sh