Install and configure ScyllaDB cluster with replication and performance tuning

Intermediate 45 min Apr 03, 2026
Ubuntu 24.04 Ubuntu 22.04 Debian 12 AlmaLinux 9 Rocky Linux 9 Fedora 41

Set up a production-ready ScyllaDB cluster with multi-node configuration, automatic replication, and performance optimization. ScyllaDB is compatible with Cassandra's CQL protocol and drivers, and its vendor reports up to 10x higher throughput than Cassandra in its own benchmarks.

Prerequisites

  • 3 or more servers with minimum 8GB RAM each
  • Root or sudo access on all nodes
  • Network connectivity between cluster nodes
  • Basic understanding of NoSQL concepts

What this solves

ScyllaDB is a high-performance NoSQL database that speaks the same CQL protocol as Apache Cassandra, so most Cassandra drivers and tools work against it unchanged. This tutorial walks you through setting up a production-ready ScyllaDB cluster with multiple nodes, configuring replication strategies for high availability, and applying performance optimizations for maximum throughput.

Step-by-step installation

Update system packages and install prerequisites

Start by updating your package manager and installing essential dependencies for ScyllaDB installation.

# Ubuntu / Debian
sudo apt update && sudo apt upgrade -y
sudo apt install -y curl gnupg2 software-properties-common apt-transport-https

# AlmaLinux / Rocky Linux / Fedora
sudo dnf update -y
sudo dnf install -y curl gnupg2 wget
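The commands above are split by distribution family. If you script this step for a mixed fleet, a minimal sketch like the following can pick the right branch. `detect_pkg_manager` is a hypothetical helper name, not part of ScyllaDB or any distribution.

```shell
#!/bin/sh
# Hypothetical helper: report which package manager this host provides.
# Prints "apt", "dnf", or "unknown".
detect_pkg_manager() {
    if command -v apt-get >/dev/null 2>&1; then
        echo apt
    elif command -v dnf >/dev/null 2>&1; then
        echo dnf
    else
        echo unknown
    fi
}

case "$(detect_pkg_manager)" in
    apt) echo "Use the apt commands (Ubuntu, Debian)" ;;
    dnf) echo "Use the dnf commands (AlmaLinux, Rocky Linux, Fedora)" ;;
    *)   echo "Unsupported package manager; install manually" ;;
esac
```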

Add ScyllaDB repository

Add the official ScyllaDB repository to get the latest stable version with security updates.

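# Ubuntu / Debian
# The repository file and signing key must match the release you install.
# Confirm both URLs on the ScyllaDB download page before running these commands.
curl -sSL https://downloads.scylladb.com/deb/ubuntu/scylladb-2024.1.list | sudo tee /etc/apt/sources.list.d/scylladb.list
# apt-key is deprecated on current Debian and Ubuntu releases, and this URL is the RPM key path.
curl -L https://downloads.scylladb.com/rpm/scylladb-rpm-key.asc | sudo apt-key add -
sudo apt update

# AlmaLinux / Rocky Linux / Fedora
sudo curl -o /etc/yum.repos.d/scylladb.repo -L https://downloads.scylladb.com/rpm/centos/scylladb-2024.1.repo
sudo rpm --import https://downloads.scylladb.com/rpm/scylladb-rpm-key.asc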

Install ScyllaDB server

Install the ScyllaDB server package which includes the database engine and management tools.

# Ubuntu / Debian
sudo apt install -y scylla

# AlmaLinux / Rocky Linux / Fedora
sudo dnf install -y scylla

Configure system limits and kernel parameters

ScyllaDB requires specific system configurations for optimal performance. These settings ensure proper memory management and network handling.

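# Limits entries for the scylla user (for example, in a file under /etc/security/limits.d/)
# Note: services started by systemd take their limits from the unit file
# (LimitNOFILE=, LimitMEMLOCK=), not from pam_limits files like this one.
scylla soft memlock unlimited
scylla hard memlock unlimited
scylla soft nofile 800000
scylla hard nofile 800000
scylla soft as unlimited
scylla hard as unlimited
scylla soft nproc 8096
scylla hard nproc 8096

# Apply the network buffer sizes to the running kernel
sudo sysctl -w net.core.rmem_max=134217728
sudo sysctl -w net.core.wmem_max=134217728
sudo sysctl -w net.ipv4.tcp_rmem="4096 65536 134217728"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 134217728"

# Persist them across reboots (run once; tee -a appends a duplicate line on every run)
echo 'net.core.rmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.core.wmem_max = 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 65536 134217728' | sudo tee -a /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 134217728' | sudo tee -a /etc/sysctl.conf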
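Because the persistence commands above append to /etc/sysctl.conf, it helps to check which keys are already present before running them again. The following is a minimal sketch; `missing_sysctl_keys` is a hypothetical helper name, and it only checks that a key is present, not that its value matches.

```shell
#!/bin/sh
# Hypothetical helper: list the buffer settings from this guide that are
# missing from a sysctl configuration file. Prints one key per line and
# returns non-zero if any key is absent.
missing_sysctl_keys() {
    conf_file="$1"
    missing=0
    for key in net.core.rmem_max net.core.wmem_max \
               net.ipv4.tcp_rmem net.ipv4.tcp_wmem; do
        # Match "key = value" or "key=value" at the start of a line.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf_file" 2>/dev/null; then
            echo "$key"
            missing=1
        fi
    done
    return "$missing"
}

# Example: report anything not yet persisted in /etc/sysctl.conf.
missing_sysctl_keys /etc/sysctl.conf || echo "Some settings are not persisted yet"
```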

Configure ScyllaDB cluster settings

Configure the main ScyllaDB configuration file for cluster operation. Replace the IP addresses with your actual server IPs.

sudo cp /etc/scylla/scylla.yaml /etc/scylla/scylla.yaml.backup
# /etc/scylla/scylla.yaml
# Cluster configuration
cluster_name: 'ScyllaDB Production Cluster'
num_tokens: 256

# Network settings
listen_address: 203.0.113.10
rpc_address: 203.0.113.10
broadcast_address: 203.0.113.10
broadcast_rpc_address: 203.0.113.10

# Seeds for cluster discovery
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "203.0.113.10,203.0.113.11,203.0.113.12"

# Data directories
data_file_directories:
  - /var/lib/scylla/data
commitlog_directory: /var/lib/scylla/commitlog
hints_directory: /var/lib/scylla/hints
view_hints_directory: /var/lib/scylla/view_hints

# Authentication and authorization
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
role_manager: CassandraRoleManager

# Performance settings
# These are Cassandra-style thread-pool options. ScyllaDB schedules work per
# shard and may ignore them, so check your version's documentation.
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32
concurrent_materialized_view_writes: 32

# Timeouts
read_request_timeout_in_ms: 5000
write_request_timeout_in_ms: 2000
counter_write_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000

# Compaction settings
compaction_throughput_mb_per_sec: 64
compaction_large_partition_warning_threshold_mb: 1000

# Enable experimental features
experimental_features:
  - udf
  - alternator-streams

Set up ScyllaDB data directories with correct permissions

Create and configure data directories with proper ownership. ScyllaDB runs as the scylla user for security.

sudo mkdir -p /var/lib/scylla/data /var/lib/scylla/commitlog /var/lib/scylla/hints /var/lib/scylla/view_hints
sudo chown -R scylla:scylla /var/lib/scylla
sudo chmod -R 755 /var/lib/scylla
Never use chmod 777. It gives every user on the system full access to your database files. The scylla user needs ownership, not world permissions.

Configure firewall rules for cluster communication

Open the required ports for ScyllaDB cluster communication, client connections, and monitoring.

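# Ubuntu / Debian (ufw)
sudo ufw allow 7000/tcp comment 'ScyllaDB inter-node communication'
sudo ufw allow 7001/tcp comment 'ScyllaDB inter-node SSL communication'
sudo ufw allow 9042/tcp comment 'ScyllaDB CQL native transport'
sudo ufw allow 19042/tcp comment 'ScyllaDB shard-aware port'
sudo ufw allow 9180/tcp comment 'ScyllaDB Prometheus metrics'
# Optional: Thrift is a legacy interface that recent ScyllaDB releases no longer ship
sudo ufw allow 9160/tcp comment 'ScyllaDB Thrift RPC'
# Caution: the REST API is unauthenticated and listens on localhost by default; open it only to trusted hosts
sudo ufw allow 10000/tcp comment 'ScyllaDB REST API'
sudo ufw reload

# AlmaLinux / Rocky Linux / Fedora (firewalld)
sudo firewall-cmd --permanent --add-port=7000/tcp --add-port=7001/tcp --add-port=9042/tcp --add-port=9160/tcp --add-port=9180/tcp --add-port=10000/tcp --add-port=19042/tcp
sudo firewall-cmd --reload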
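After opening the ports, you can confirm from one node that another node accepts connections. The sketch below assumes bash and the coreutils `timeout` command are installed. `check_port` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a TCP connection to HOST:PORT succeeds
# within 3 seconds. Relies on bash's /dev/tcp redirection, so bash is
# invoked explicitly even when this file is run by another shell.
check_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

For example, `check_port 203.0.113.11 7000 && echo reachable` tests the inter-node port on the second example node. A closed or filtered port makes the function return non-zero.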

Run ScyllaDB system optimization

Use ScyllaDB's built-in setup script to optimize system parameters for maximum performance.

sudo scylla_setup --no-raid-setup --no-fstrim-setup --no-kernel-check --no-verify-package --no-enable-service --no-selinux-setup
sudo scylla_io_setup

Enable and start ScyllaDB service

Enable ScyllaDB to start automatically on boot and start the service on the first node.

sudo systemctl enable scylla-server
sudo systemctl start scylla-server
sudo systemctl status scylla-server

Configure additional cluster nodes

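Repeat the installation steps on each additional node, setting listen_address, rpc_address, broadcast_address, and broadcast_rpc_address in scylla.yaml to that node's own IP. Keep cluster_name and the seeds list identical on every node. Wait for the first node to fully start before adding others.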

# Node 2
listen_address: 203.0.113.11
rpc_address: 203.0.113.11
broadcast_address: 203.0.113.11
broadcast_rpc_address: 203.0.113.11

# Node 3
listen_address: 203.0.113.12
rpc_address: 203.0.113.12
broadcast_address: 203.0.113.12
broadcast_rpc_address: 203.0.113.12
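Since the four address settings always carry the same IP on a given node, they can be generated rather than typed by hand. This is a minimal sketch; `render_node_addresses` is a hypothetical helper name, and you would still paste its output into scylla.yaml yourself.

```shell
#!/bin/sh
# Hypothetical helper: print the four per-node address settings for one IP.
render_node_addresses() {
    node_ip="$1"
    for key in listen_address rpc_address broadcast_address broadcast_rpc_address; do
        printf '%s: %s\n' "$key" "$node_ip"
    done
}

render_node_addresses 203.0.113.11
```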

Set up keyspaces with replication

Create keyspaces with proper replication strategies for high availability. Connect using cqlsh and create your application keyspaces.

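# Connect with the default superuser (shell)
cqlsh 203.0.113.10 -u cassandra -p cassandra

-- Run the following statements at the cqlsh prompt.
-- The datacenter name must match the one reported by nodetool status.
CREATE KEYSPACE production WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'datacenter1': 3
} AND DURABLE_WRITES = true;

-- SimpleStrategy ignores datacenter and rack placement
CREATE KEYSPACE analytics WITH REPLICATION = {
  'class': 'SimpleStrategy',
  'replication_factor': 2
} AND DURABLE_WRITES = true;

-- Change the default superuser password (replace the example value with your own)
ALTER ROLE cassandra WITH PASSWORD = 'SecureP@ssw0rd123!';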

Configure performance monitoring

Enable ScyllaDB monitoring and metrics collection for performance tracking. This integrates with existing monitoring solutions.

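# Add to the existing /etc/scylla/scylla.yaml
# Metrics and monitoring
# prometheus_port is a ScyllaDB setting (9180 is its default). Confirm that the
# other two option names exist in your version before adding them.
metrics_update_interval_in_ms: 60000
prometheus_port: 9180

# Enable detailed metrics
histogram_collection_enabled: true

# Restart to apply the changes (shell)
sudo systemctl restart scylla-server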

Configure replication and consistency

Set up replication strategies

Configure different replication strategies based on your availability requirements and data center topology.

-- For single datacenter with 3 replicas
CREATE KEYSPACE user_data WITH REPLICATION = {
  'class': 'SimpleStrategy',
  'replication_factor': 3
};

-- For multi-datacenter setup
CREATE KEYSPACE global_data WITH REPLICATION = {
  'class': 'NetworkTopologyStrategy',
  'dc1': 2,
  'dc2': 1
};

-- Example table with optimal settings
USE user_data;
CREATE TABLE user_profiles (
    user_id UUID PRIMARY KEY,
    username TEXT,
    email TEXT,
    created_at TIMESTAMP,
    last_login TIMESTAMP
) WITH bloom_filter_fp_chance = 0.01
   AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
   AND comment = 'User profile data'
   AND compaction = {'class': 'LeveledCompactionStrategy'}
   AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
   AND default_time_to_live = 0
   AND gc_grace_seconds = 864000;
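For multi-datacenter keyspaces, the replication map is just a list of datacenter names and replica counts. The sketch below renders such a statement from `DC:RF` pairs so it can be reviewed before being piped to cqlsh. `render_keyspace_cql` is a hypothetical helper name, and `IF NOT EXISTS` is added here so that reruns are harmless.

```shell
#!/bin/sh
# Hypothetical helper: print a CREATE KEYSPACE statement using
# NetworkTopologyStrategy. Usage: render_keyspace_cql NAME DC:RF [DC:RF ...]
render_keyspace_cql() {
    ks_name="$1"
    shift
    printf "CREATE KEYSPACE IF NOT EXISTS %s WITH REPLICATION = {\n" "$ks_name"
    printf "  'class': 'NetworkTopologyStrategy'"
    for pair in "$@"; do
        dc="${pair%%:*}"   # text before the first colon
        rf="${pair#*:}"    # text after the first colon
        printf ",\n  '%s': %s" "$dc" "$rf"
    done
    printf "\n};\n"
}

render_keyspace_cql global_data dc1:2 dc2:1
```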

Configure consistency levels

Set appropriate consistency levels for read and write operations to balance performance and data consistency. CQL 3 has no USING CONSISTENCY clause: in cqlsh the level is set per session with the CONSISTENCY command, and in application drivers it is set per request.

-- Set the session consistency level (cqlsh command)
CONSISTENCY QUORUM;

-- Test a write at LOCAL_QUORUM
CONSISTENCY LOCAL_QUORUM;
INSERT INTO user_profiles (user_id, username, email, created_at)
VALUES (uuid(), 'testuser', 'test@example.com', toTimestamp(now()));

-- Test a read at LOCAL_ONE
CONSISTENCY LOCAL_ONE;
SELECT * FROM user_profiles LIMIT 1;
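To reason about which level to pick, it helps to compute the quorum size for a given replication factor. The arithmetic is floor(RF / 2) + 1, sketched here as a small helper named `quorum` for illustration.

```shell
#!/bin/sh
# Quorum for a replication factor RF is floor(RF / 2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Reads at level R and writes at level W overlap on at least one replica
# when R + W > RF, which is why QUORUM reads see QUORUM writes.
rf=3
q="$(quorum "$rf")"
echo "RF=$rf quorum=$q tolerated_replica_failures=$(( rf - q ))"
```

With a replication factor of 3 the quorum is 2, so QUORUM operations keep working with one replica down. With a replication factor of 2 the quorum is also 2, so a single replica failure blocks QUORUM operations.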

Performance tuning

Optimize memory settings

Configure memory allocation for optimal performance based on your server specifications.

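# Memory settings (adjust based on available RAM)
# These are Cassandra-style options. ScyllaDB sizes its memtables and cache
# automatically and has no separate key cache, so confirm that each option is
# supported by your version before relying on it.

# For servers with 32GB+ RAM
memtable_heap_space_in_mb: 2048
memtable_offheap_space_in_mb: 2048

# Row cache (use with caution)
row_cache_size_in_mb: 512
row_cache_save_period: 14400

# Key cache
key_cache_size_in_mb: 256
key_cache_save_period: 14400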

Configure I/O optimization

Optimize disk I/O settings for your storage backend. These settings significantly impact write performance.

# I/O optimization
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32

# Flush settings
memtable_flush_writers: 2
concurrent_compactors: 4

# Batch settings
batch_size_warn_threshold_in_kb: 64
batch_size_fail_threshold_in_kb: 640
unlogged_batch_across_partitions_warn_threshold: 100

Enable JMX monitoring

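Configure JMX for advanced monitoring and management. This allows integration with monitoring tools. ScyllaDB itself is written in C++, so JMX is served by a separate compatibility service (scylla-jmx) in the releases that ship it. Check that your version supports these options before applying them.

# JMX configuration
jmx_port: 7199
local_jmx: false

# Additional monitoring
incremental_backups: true
snapshot_before_compaction: false

# Restart to apply the changes (shell)
sudo systemctl restart scylla-server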

Verify your setup

# Check cluster status
nodetool status

# Check cluster information
nodetool info

# Test CQL connection
cqlsh 203.0.113.10 -u cassandra -p 'SecureP@ssw0rd123!'

# Check keyspace replication (host and credentials are required because authentication is enabled)
cqlsh 203.0.113.10 -u cassandra -p 'SecureP@ssw0rd123!' -e "DESCRIBE KEYSPACES;"

# Monitor cluster performance
nodetool cfstats
nodetool tpstats
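In nodetool status output, a healthy node is listed with the state UN (Up/Normal). A minimal sketch for counting such nodes in a script follows. `count_up_normal` is a hypothetical helper name, and it assumes the state code is the first field on each node line.

```shell
#!/bin/sh
# Hypothetical helper: count nodes reported as UN (Up/Normal) in
# `nodetool status` output read from standard input.
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Usage on a live node (the three-node cluster in this guide should print 3):
#   nodetool status | count_up_normal
```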

Port 10000 serves the ScyllaDB REST API rather than a web interface, and by default it listens only on localhost, so query it from the node itself unless you changed api_address. Prometheus metrics are available at http://203.0.113.10:9180/metrics.

Common issues

Symptom | Cause | Fix
Node won't join cluster | Network connectivity or firewall | Check firewall rules and test port connectivity with telnet 203.0.113.10 7000
High memory usage | Memtable settings too high | Reduce memtable_heap_space_in_mb and memtable_offheap_space_in_mb values
Slow write performance | Insufficient concurrent writers | Increase concurrent_writes and memtable_flush_writers in scylla.yaml
Authentication failures | Default credentials not changed | Use ALTER ROLE cassandra WITH PASSWORD to set a secure password
Disk space issues | Compaction not running | Run nodetool compact and check compaction_throughput_mb_per_sec
Connection timeout | Network latency or overload | Increase read_request_timeout_in_ms and write_request_timeout_in_ms
