Set up a multi-node ScyllaDB cluster with replication and performance tuning suitable for production. ScyllaDB is a CQL-compatible alternative to Apache Cassandra; vendor benchmarks claim up to 10x higher throughput, though actual gains depend on workload and hardware.
Prerequisites
- 3 or more servers with minimum 8GB RAM each
- Root or sudo access on all nodes
- Network connectivity between cluster nodes
- Basic understanding of NoSQL concepts
What this solves
ScyllaDB is a high-performance NoSQL database that is compatible with the Apache Cassandra CQL protocol and drivers. Vendor benchmarks claim up to 10x the throughput of Cassandra, although the real difference depends on workload and hardware. This tutorial walks you through setting up a ScyllaDB cluster with multiple nodes, configuring replication strategies for high availability, and applying performance optimizations for throughput.
Step-by-step installation
Update system packages and install prerequisites
Start by updating your package manager and installing essential dependencies for ScyllaDB installation.
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl gnupg2 software-properties-common apt-transport-https
Add ScyllaDB repository
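Add the official ScyllaDB repository so that apt can install the 2024.1 release and its security updates. Note that apt-key is deprecated on recent Debian and Ubuntu releases; if it is unavailable on your system, use the keyring-based method from the ScyllaDB download instructions instead.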
curl -sSL https://downloads.scylladb.com/deb/ubuntu/scylladb-2024.1.list | sudo tee /etc/apt/sources.list.d/scylladb.list
curl -L https://downloads.scylladb.com/rpm/scylladb-rpm-key.asc | sudo apt-key add -
sudo apt update
Install ScyllaDB server
Install the ScyllaDB server package which includes the database engine and management tools.
sudo apt install -y scylla
Configure system limits and kernel parameters
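ScyllaDB needs raised resource limits and larger network buffers for optimal performance. Add the following limits to /etc/security/limits.d/scylla.conf, then apply the kernel parameters with sysctl and append them to /etc/sysctl.conf so they persist across reboots.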
scylla soft memlock unlimited
scylla hard memlock unlimited
scylla soft nofile 800000
scylla hard nofile 800000
scylla soft as unlimited
scylla hard as unlimited
scylla soft nproc 8096
scylla hard nproc 8096
sudo sysctl -w net.core.rmem_max=134217728
sudo sysctl -w net.core.wmem_max=134217728
sudo sysctl -w net.ipv4.tcp_rmem="4096 65536 134217728"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"
echo 'net.core.rmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.core.wmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 65536 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 134217728' | sudo tee -a /etc/sysctl.conf
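The value 134217728 used above is 128 MiB expressed in bytes. As a minimal sketch, the helper below derives the byte count from a size in MiB and prints the four settings in sysctl.conf format. The function name print_net_buffer_sysctls is hypothetical and not part of ScyllaDB or sysctl.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the network buffer settings for a given size in MiB.
print_net_buffer_sysctls() {
    local mib="$1"
    local bytes=$((mib * 1024 * 1024))   # 128 MiB -> 134217728 bytes
    printf 'net.core.rmem_max = %d\n' "$bytes"
    printf 'net.core.wmem_max = %d\n' "$bytes"
    printf 'net.ipv4.tcp_rmem = 4096 65536 %d\n' "$bytes"
    printf 'net.ipv4.tcp_wmem = 4096 65536 %d\n' "$bytes"
}

print_net_buffer_sysctls 128
```

Redirecting this output into a single drop-in file under /etc/sysctl.d/ is one way to avoid appending duplicate lines to /etc/sysctl.conf when the step is repeated.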
Configure ScyllaDB cluster settings
Back up the default configuration, then edit /etc/scylla/scylla.yaml for cluster operation. Replace the IP addresses with your actual server IPs.
sudo cp /etc/scylla/scylla.yaml /etc/scylla/scylla.yaml.backup
# Cluster configuration
cluster_name: 'ScyllaDB Production Cluster'
num_tokens: 256
# Network settings
listen_address: 203.0.113.10
rpc_address: 203.0.113.10
broadcast_address: 203.0.113.10
broadcast_rpc_address: 203.0.113.10
# Seeds for cluster discovery
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "203.0.113.10,203.0.113.11,203.0.113.12"
# Data directories
data_file_directories:
  - /var/lib/scylla/data
commitlog_directory: /var/lib/scylla/commitlog
hints_directory: /var/lib/scylla/hints
view_hints_directory: /var/lib/scylla/view_hints
# Authentication and authorization
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
role_manager: CassandraRoleManager
# Performance settings
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32
concurrent_materialized_view_writes: 32
# Timeouts
read_request_timeout_in_ms: 5000
write_request_timeout_in_ms: 2000
counter_write_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
# Compaction settings
compaction_throughput_mb_per_sec: 64
compaction_large_partition_warning_threshold_mb: 1000
# Enable experimental features
experimental_features:
  - udf
  - alternator-streams
Set up ScyllaDB data directories with correct permissions
Create and configure data directories with proper ownership. ScyllaDB runs as the scylla user for security.
sudo mkdir -p /var/lib/scylla/data /var/lib/scylla/commitlog /var/lib/scylla/hints /var/lib/scylla/view_hints
sudo chown -R scylla:scylla /var/lib/scylla
sudo chmod -R 755 /var/lib/scylla
Configure firewall rules for cluster communication
Open the required ports for ScyllaDB cluster communication, client connections, and monitoring.
sudo ufw allow 7000/tcp comment 'ScyllaDB inter-node communication'
sudo ufw allow 7001/tcp comment 'ScyllaDB inter-node SSL communication'
sudo ufw allow 9042/tcp comment 'ScyllaDB CQL native transport'
sudo ufw allow 9160/tcp comment 'ScyllaDB Thrift RPC'
sudo ufw allow 10000/tcp comment 'ScyllaDB REST API'
sudo ufw allow 19042/tcp comment 'ScyllaDB shard-aware port'
sudo ufw reload
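Once the rules are in place, it can help to confirm from each node that its peers are actually reachable on the inter-node and CQL ports. The sketch below uses bash's /dev/tcp feature together with the coreutils timeout command; the function names port_open and check_cluster_ports are hypothetical helpers, not ScyllaDB tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port opens within 2 seconds.
port_open() {
    local host="$1" port="$2"
    # /dev/tcp is a bash feature, so the probe runs in an explicit bash subprocess.
    timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Report reachability of the inter-node (7000) and CQL (9042) ports for each host given.
check_cluster_ports() {
    local host port
    for host in "$@"; do
        for port in 7000 9042; do
            if port_open "$host" "$port"; then
                echo "$host:$port reachable"
            else
                echo "$host:$port NOT reachable"
            fi
        done
    done
}
```

For example, running check_cluster_ports 203.0.113.11 203.0.113.12 on the first node prints one line per host and port. A port only shows as reachable once ScyllaDB is running on the target node, so a negative result before the service starts does not by itself indicate a firewall problem.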
Run ScyllaDB system optimization
Use ScyllaDB's built-in setup script to optimize system parameters for maximum performance.
sudo scylla_setup --no-raid-setup --no-fstrim-setup --no-kernel-check --no-verify-package --no-enable-service --no-selinux-setup
sudo scylla_io_setup
Enable and start ScyllaDB service
Enable ScyllaDB to start automatically on boot and start the service on the first node.
sudo systemctl enable scylla-server
sudo systemctl start scylla-server
sudo systemctl status scylla-server
Configure additional cluster nodes
Repeat the installation steps on each additional node, changing listen_address, rpc_address, broadcast_address, and broadcast_rpc_address in scylla.yaml to that node's own IP. Keep cluster_name and the seeds list identical on every node, and wait for the first node to fully start before adding others.
# Node 2
listen_address: 203.0.113.11
rpc_address: 203.0.113.11
broadcast_address: 203.0.113.11
broadcast_rpc_address: 203.0.113.11
# Node 3
listen_address: 203.0.113.12
rpc_address: 203.0.113.12
broadcast_address: 203.0.113.12
broadcast_rpc_address: 203.0.113.12
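Because only the four address keys differ between nodes, the per-node edit can be scripted. The following is a sketch assuming GNU sed (as shipped on Ubuntu) and uncommented keys at the start of a line; set_node_addresses is a hypothetical helper name, and the demonstration works on a scratch file rather than the live configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point the four address keys in a scylla.yaml file at one IP.
set_node_addresses() {
    local file="$1" ip="$2"
    # -E enables alternation; \1 keeps whichever key name matched.
    sed -i -E \
        "s/^(listen_address|rpc_address|broadcast_address|broadcast_rpc_address):.*/\1: ${ip}/" \
        "$file"
}

# Demonstrate on a scratch copy rather than the live /etc/scylla/scylla.yaml.
demo_file="$(mktemp)"
printf '%s\n' \
    "cluster_name: 'ScyllaDB Production Cluster'" \
    'listen_address: 203.0.113.10' \
    'rpc_address: 203.0.113.10' \
    'broadcast_address: 203.0.113.10' \
    'broadcast_rpc_address: 203.0.113.10' > "$demo_file"

set_node_addresses "$demo_file" 203.0.113.11
cat "$demo_file"
```

Lines that are commented out or indented are left untouched by this pattern, so it is worth checking the result with grep before restarting the service.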
Set up keyspaces with replication
Create keyspaces with proper replication strategies for high availability. Connect using cqlsh and create your application keyspaces.
cqlsh 203.0.113.10 -u cassandra -p cassandra
CREATE KEYSPACE production WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 3
} AND DURABLE_WRITES = true;
CREATE KEYSPACE analytics WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 2
} AND DURABLE_WRITES = true;
-- Change default superuser password
ALTER ROLE cassandra WITH PASSWORD = 'SecureP@ssw0rd123!';
Configure performance monitoring
Expose ScyllaDB metrics for performance tracking by adding the following to /etc/scylla/scylla.yaml and restarting the service. The Prometheus endpoint can then be scraped by an existing monitoring stack.
# Add to existing configuration
# Metrics and monitoring
metrics_update_interval_in_ms: 60000
prometheus_port: 9180
# Enable detailed metrics
histogram_collection_enabled: true
sudo systemctl restart scylla-server
Configure replication and consistency
Set up replication strategies
Configure different replication strategies based on your availability requirements and data center topology.
-- For single datacenter with 3 replicas
CREATE KEYSPACE user_data WITH REPLICATION = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
};
-- For multi-datacenter setup
CREATE KEYSPACE global_data WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 2,
    'dc2': 1
};
-- Example table with optimal settings
USE user_data;
CREATE TABLE user_profiles (
    user_id UUID PRIMARY KEY,
    username TEXT,
    email TEXT,
    created_at TIMESTAMP,
    last_login TIMESTAMP
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = 'User profile data'
    AND compaction = {'class': 'LeveledCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000;
Configure consistency levels
Set appropriate consistency levels for read and write operations to balance performance and data consistency. CQL has no per-statement USING CONSISTENCY clause: in cqlsh, the CONSISTENCY command sets the level for all following statements in the session, and applications set it per statement through the driver.
-- Set session consistency
CONSISTENCY QUORUM;
-- Test write at a different consistency level
CONSISTENCY LOCAL_QUORUM;
INSERT INTO user_profiles (user_id, username, email, created_at)
VALUES (uuid(), 'testuser', 'test@example.com', toTimestamp(now()));
-- Test read at a different consistency level
CONSISTENCY LOCAL_ONE;
SELECT * FROM user_profiles LIMIT 1;
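How many replicas a QUORUM request needs follows directly from the replication factor: floor(RF / 2) + 1. The short sketch below works this out for a few replication factors; the function names quorum and tolerated_failures are illustrative only.

```shell
#!/usr/bin/env bash
# Replicas that must respond for QUORUM: floor(RF / 2) + 1.
quorum() {
    local rf="$1"
    echo $(( rf / 2 + 1 ))
}

# Replicas that can be down while QUORUM requests still succeed.
tolerated_failures() {
    local rf="$1"
    echo $(( rf - $(quorum "$rf") ))
}

for rf in 2 3 5; do
    echo "RF=$rf quorum=$(quorum "$rf") tolerated_failures=$(tolerated_failures "$rf")"
done
```

With a replication factor of 3, QUORUM needs 2 replicas and survives one replica being down. With a replication factor of 2, as in the analytics keyspace, QUORUM needs both replicas, so a single node failure blocks QUORUM reads and writes for the affected data. For NetworkTopologyStrategy, QUORUM is computed over the sum of the per-datacenter replication factors, while LOCAL_QUORUM uses only the local datacenter's factor.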
Performance tuning optimization
Optimize memory settings
Configure memory allocation for optimal performance based on your server specifications.
# Memory settings (adjust based on available RAM)
# For servers with 32GB+ RAM
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048
# Row cache (use with caution)
row_cache_size_in_mb: 512
row_cache_save_period: 14400
# Key cache
key_cache_size_in_mb: 256
key_cache_save_period: 14400
Configure I/O optimization
Optimize disk I/O settings for your storage backend. These settings significantly impact write performance.
# I/O optimization
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
# Flush settings
memtable_flush_writers: 2
concurrent_compactors: 4
# Batch settings
batch_size_warn_threshold_in_kb: 64
batch_size_fail_threshold_in_kb: 640
unlogged_batch_across_partitions_warn_threshold: 100
Enable JMX monitoring
Configure JMX for advanced monitoring and management. This allows integration with monitoring tools.
# JMX configuration
jmx_port: 7199
local_jmx: false
# Additional monitoring
incremental_backups: true
snapshot_before_compaction: false
sudo systemctl restart scylla-server
Verify your setup
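# Check cluster status
nodetool status
# Check cluster information
nodetool info
# Test CQL connection
cqlsh 203.0.113.10 -u cassandra -p 'SecureP@ssw0rd123!'
# Check keyspace replication (host and credentials are required because authentication is enabled)
cqlsh 203.0.113.10 -u cassandra -p 'SecureP@ssw0rd123!' -e "DESCRIBE KEYSPACE production;"
# Monitor cluster performance
nodetool cfstats
nodetool tpstats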
Port 10000 serves ScyllaDB's REST API rather than a web interface, and by default it binds to localhost, so query it from the node itself. Prometheus metrics are available at http://203.0.113.10:9180/metrics.
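A quick health check is to count how many nodes nodetool status reports as UN (Up/Normal). The sketch below reads status output on stdin; count_up_normal is a hypothetical helper, and the sample text is illustrative only, since real output has more columns and different values.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count nodes reported as Up/Normal (UN) in `nodetool status` output.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Illustrative sample only; real output has more columns and different values.
sample_status='Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load    Tokens  Owns  Host ID  Rack
UN  203.0.113.10  1.2 MB  256     ?     aaaa     rack1
UN  203.0.113.11  1.1 MB  256     ?     bbbb     rack1
DN  203.0.113.12  1.0 MB  256     ?     cccc     rack1'

printf '%s\n' "$sample_status" | count_up_normal
```

On a live node, nodetool status | count_up_normal should print 3 for a healthy three-node cluster; a lower number means at least one node is down or still joining.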
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Node won't join cluster | Network connectivity or firewall | Check firewall rules and test port connectivity with telnet 203.0.113.10 7000 |
| High memory usage | Memtable settings too high | Reduce memtable_heap_space_in_mb and memtable_offheap_space_in_mb values |
| Slow write performance | Insufficient concurrent writers | Increase concurrent_writes and memtable_flush_writers in scylla.yaml |
| Authentication failures | Default credentials not changed | Use ALTER ROLE cassandra WITH PASSWORD to set secure password |
| Disk space issues | Compaction not running | Run nodetool compact and check compaction_throughput_mb_per_sec |
| Connection timeout | Network latency or overload | Increase read_request_timeout_in_ms and write_request_timeout_in_ms |
Automated install script
Run this script on each node to automate the installation and base configuration steps above.
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Script variables
CLUSTER_NAME="ScyllaDB Production Cluster"
DEFAULT_SEEDS="127.0.0.1"
# Cleanup function
cleanup() {
    echo -e "${RED}[ERROR] Installation failed. Cleaning up...${NC}"
    # The systemd unit is named scylla-server
    systemctl stop scylla-server 2>/dev/null || true
    # PKG_MGR may be unset if the failure happened before distro detection
    case "${PKG_MGR:-}" in
        apt)
            apt-get remove --purge -y scylla 2>/dev/null || true
            rm -f /etc/apt/sources.list.d/scylladb.list
            ;;
        dnf|yum)
            "$PKG_MGR" remove -y scylla 2>/dev/null || true
            rm -f /etc/yum.repos.d/scylladb.repo
            ;;
    esac
    rm -f /etc/security/limits.d/scylla.conf
    exit 1
}
trap cleanup ERR
usage() {
    echo "Usage: $0 [OPTIONS]"
    echo "Options:"
    echo "  -i, --ip IP          Node IP address (default: auto-detect)"
    echo "  -s, --seeds SEEDS    Comma-separated seed IPs (default: 127.0.0.1)"
    echo "  -c, --cluster NAME   Cluster name (default: 'ScyllaDB Production Cluster')"
    echo "  -h, --help           Show this help"
    exit 1
}
# Parse arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        -i|--ip)
            NODE_IP="$2"
            shift 2
            ;;
        -s|--seeds)
            SEED_NODES="$2"
            shift 2
            ;;
        -c|--cluster)
            CLUSTER_NAME="$2"
            shift 2
            ;;
        -h|--help)
            usage
            ;;
        *)
            echo "Unknown option $1"
            usage
            ;;
    esac
done
# Check if running as root
if [[ $EUID -ne 0 ]]; then
    echo -e "${RED}This script must be run as root${NC}"
    exit 1
fi
# Auto-detect distro
if [ -f /etc/os-release ]; then
    . /etc/os-release
    case "$ID" in
        ubuntu|debian)
            PKG_MGR="apt"
            PKG_INSTALL="apt-get install -y"
            PKG_UPDATE="apt-get update"
            PKG_UPGRADE="apt-get upgrade -y"
            ;;
        almalinux|rocky|centos|rhel|ol|fedora)
            PKG_MGR="dnf"
            PKG_INSTALL="dnf install -y"
            PKG_UPDATE="dnf check-update || true"
            PKG_UPGRADE="dnf upgrade -y"
            ;;
        amzn)
            PKG_MGR="yum"
            PKG_INSTALL="yum install -y"
            PKG_UPDATE="yum check-update || true"
            PKG_UPGRADE="yum update -y"
            ;;
        *)
            echo -e "${RED}Unsupported distro: $ID${NC}"
            exit 1
            ;;
    esac
else
    echo -e "${RED}Cannot detect OS distribution${NC}"
    exit 1
fi
# Auto-detect IP if not provided
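if [ -z "${NODE_IP:-}" ]; then
    # Take the address after "src" rather than a fixed field, since the field position varies with the route
    NODE_IP=$(ip -4 route get 8.8.8.8 2>/dev/null | awk '{for (i = 1; i < NF; i++) if ($i == "src") {print $(i + 1); exit}}' || true)
    NODE_IP="${NODE_IP:-127.0.0.1}"
fi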
# Set default seeds if not provided
SEED_NODES="${SEED_NODES:-$DEFAULT_SEEDS}"
echo -e "${GREEN}ScyllaDB Cluster Installation Script${NC}"
echo "Detected OS: $PRETTY_NAME"
echo "Node IP: $NODE_IP"
echo "Seed nodes: $SEED_NODES"
echo "Cluster name: $CLUSTER_NAME"
echo
# Step 1: Update system packages
echo -e "${YELLOW}[1/8] Updating system packages...${NC}"
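# PKG_UPDATE may contain "|| true", which is only parsed as shell syntax when run through eval
eval "$PKG_UPDATE"
$PKG_UPGRADE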
# Step 2: Install prerequisites
echo -e "${YELLOW}[2/8] Installing prerequisites...${NC}"
if [ "$PKG_MGR" = "apt" ]; then
    $PKG_INSTALL curl gnupg2 software-properties-common apt-transport-https
else
    $PKG_INSTALL curl gnupg2 wget
fi
# Step 3: Add ScyllaDB repository
echo -e "${YELLOW}[3/8] Adding ScyllaDB repository...${NC}"
if [ "$PKG_MGR" = "apt" ]; then
    curl -sSL https://downloads.scylladb.com/deb/ubuntu/scylladb-2024.1.list > /etc/apt/sources.list.d/scylladb.list
    curl -L https://downloads.scylladb.com/rpm/scylladb-rpm-key.asc | apt-key add -
    apt-get update
else
    curl -o /etc/yum.repos.d/scylladb.repo -L https://downloads.scylladb.com/rpm/centos/scylladb-2024.1.repo
    rpm --import https://downloads.scylladb.com/rpm/scylladb-rpm-key.asc
fi
# Step 4: Install ScyllaDB
echo -e "${YELLOW}[4/8] Installing ScyllaDB server...${NC}"
$PKG_INSTALL scylla
# Step 5: Configure system limits
echo -e "${YELLOW}[5/8] Configuring system limits and kernel parameters...${NC}"
cat > /etc/security/limits.d/scylla.conf << 'EOF'
scylla soft memlock unlimited
scylla hard memlock unlimited
scylla soft nofile 800000
scylla hard nofile 800000
scylla soft as unlimited
scylla hard as unlimited
scylla soft nproc 8096
scylla hard nproc 8096
EOF
# Configure kernel parameters
sysctl -w net.core.rmem_max=134217728
sysctl -w net.core.wmem_max=134217728
sysctl -w net.ipv4.tcp_rmem="4096 65536 134217728"
sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"
cat >> /etc/sysctl.conf << 'EOF'
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 65536 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
EOF
# Step 6: Set up data directories
echo -e "${YELLOW}[6/8] Setting up ScyllaDB data directories...${NC}"
mkdir -p /var/lib/scylla/{data,commitlog,hints,view_hints}
chown -R scylla:scylla /var/lib/scylla
chmod 755 /var/lib/scylla
chmod 755 /var/lib/scylla/{data,commitlog,hints,view_hints}
# Step 7: Configure ScyllaDB
echo -e "${YELLOW}[7/8] Configuring ScyllaDB cluster settings...${NC}"
cp /etc/scylla/scylla.yaml /etc/scylla/scylla.yaml.backup
cat > /etc/scylla/scylla.yaml << EOF
# Cluster configuration
cluster_name: '$CLUSTER_NAME'
num_tokens: 256
# Network settings
listen_address: $NODE_IP
rpc_address: $NODE_IP
broadcast_address: $NODE_IP
broadcast_rpc_address: $NODE_IP
# Seeds for cluster discovery
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "$SEED_NODES"
# Data directories
data_file_directories:
- /var/lib/scylla/data
commitlog_directory: /var/lib/scylla/commitlog
hints_directory: /var/lib/scylla/hints
view_hints_directory: /var/lib/scylla/view_hints
# Authentication and authorization
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
role_manager: CassandraRoleManager
# Performance settings
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32
concurrent_materialized_view_writes: 32
# Timeouts
read_request_timeout_in_ms: 5000
write_request_timeout_in_ms: 2000
counter_write_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
# Compaction settings
compaction_throughput_mb_per_sec: 64
compaction_large_partition_warning_threshold_mb: 1000
# Enable experimental features
experimental_features:
- udf
- alternator-streams
# Additional performance optimizations
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
memtable_allocation_type: heap_buffers
EOF
chown scylla:scylla /etc/scylla/scylla.yaml
chmod 644 /etc/scylla/scylla.yaml
# Step 8: Start and enable ScyllaDB
echo -e "${YELLOW}[8/8] Starting ScyllaDB service...${NC}"
systemctl enable scylla-server
systemctl start scylla-server
# Wait up to 120 seconds for the CQL port; a fixed sleep is often too short for a first startup
for _ in $(seq 1 60); do
    if ss -tln | grep -q ":9042"; then
        break
    fi
    sleep 2
done
# Verification
echo -e "${YELLOW}Verifying installation...${NC}"
if systemctl is-active --quiet scylla-server; then
    echo -e "${GREEN}✓ ScyllaDB service is running${NC}"
else
    echo -e "${RED}✗ ScyllaDB service is not running${NC}"
    exit 1
fi
if ss -tln | grep -q ":9042"; then
    echo -e "${GREEN}✓ ScyllaDB is listening on port 9042${NC}"
else
    echo -e "${RED}✗ ScyllaDB is not listening on port 9042${NC}"
    exit 1
fi
echo
echo -e "${GREEN}ScyllaDB installation completed successfully!${NC}"
echo
echo "Next steps:"
echo "1. Repeat this installation on other cluster nodes"
echo "2. Connect using cqlsh: cqlsh $NODE_IP -u cassandra -p cassandra"
echo "3. Create a keyspace with replication strategy"
echo "4. Monitor logs: journalctl -u scylla-server -f"
echo
echo -e "${YELLOW}Note: The default superuser credentials are cassandra/cassandra.${NC}"
echo -e "${YELLOW}Change the password with ALTER ROLE before exposing the cluster.${NC}"
Review the script before running. It must run as root, so execute it with: sudo bash install.sh
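To bring up a three-node cluster with the script, each node needs its own --ip while --seeds and --cluster stay the same everywhere. The sketch below only prints the command to run on each node and does not execute anything; it assumes the script is saved as install.sh and uses the example addresses from this tutorial, with the first address acting as the seed. The function name print_install_commands is hypothetical.

```shell
#!/usr/bin/env bash
# Print (not run) the install command for each node; the first IP given acts as the seed.
print_install_commands() {
    local cluster="$1" seed="$2"
    shift 2
    local ip
    for ip in "$seed" "$@"; do
        printf "sudo bash install.sh --ip %s --seeds %s --cluster '%s'\n" "$ip" "$seed" "$cluster"
    done
}

print_install_commands 'ScyllaDB Production Cluster' 203.0.113.10 203.0.113.11 203.0.113.12
```

Run the first printed command on the seed node and wait for it to finish starting before running the others on their respective nodes.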