Set up MongoDB sharding with geographic zones to distribute data based on location, ensuring optimal performance for global applications and regulatory compliance.
Prerequisites
- At least 9 servers (3 config, 6+ shard members)
- MongoDB 8.0 or higher
- Network connectivity between all servers
- Basic understanding of MongoDB concepts
What this solves
MongoDB sharding distributes data across multiple servers to handle large datasets and high throughput. Zone-based sharding adds geographic awareness, placing data in specific regions based on configurable rules. This ensures low latency for users, meets data sovereignty requirements, and optimizes resource utilization for global applications.
Step-by-step configuration
Install MongoDB on all servers
Install MongoDB Community Server on the servers that will host config servers, mongos routers, and shard members.
curl -fsSL https://www.mongodb.org/static/pgp/server-8.0.asc | sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-8.0.gpg
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/8.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list
sudo apt update
sudo apt install -y mongodb-org
Configure config servers
Set up three config servers for metadata storage. Config servers store cluster metadata and shard configuration.
systemLog:
destination: file
path: /var/log/mongodb/mongod-config.log
logAppend: true
storage:
dbPath: /var/lib/mongodb-config
journal:
enabled: true
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod-config.pid
net:
port: 27019
bindIp: 0.0.0.0
replication:
replSetName: configReplSet
sharding:
clusterRole: configsvr
sudo mkdir -p /var/lib/mongodb-config
sudo chown mongodb:mongodb /var/lib/mongodb-config
sudo systemctl start mongod -f /etc/mongod-config.conf
Initialize config server replica set
Connect to one config server and initialize the replica set with all three config servers.
mongosh --port 27019
rs.initiate({
_id: "configReplSet",
configsvr: true,
members: [
{ _id: 0, host: "10.0.1.10:27019" },
{ _id: 1, host: "10.0.1.11:27019" },
{ _id: 2, host: "10.0.1.12:27019" }
]
})
Configure shard servers
Set up shard servers in different geographic regions. Each shard should be a replica set for high availability.
systemLog:
destination: file
path: /var/log/mongodb/mongod-shard.log
logAppend: true
storage:
dbPath: /var/lib/mongodb-shard
journal:
enabled: true
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod-shard.pid
net:
port: 27018
bindIp: 0.0.0.0
replication:
replSetName: shard01
sharding:
clusterRole: shardsvr
sudo mkdir -p /var/lib/mongodb-shard
sudo chown mongodb:mongodb /var/lib/mongodb-shard
sudo systemctl start mongod -f /etc/mongod-shard.conf
Initialize shard replica sets
Initialize replica sets for each geographic region. Repeat this for each shard in different regions.
mongosh --port 27018
rs.initiate({
_id: "shard01",
members: [
{ _id: 0, host: "10.0.2.10:27018" },
{ _id: 1, host: "10.0.2.11:27018" },
{ _id: 2, host: "10.0.2.12:27018" }
]
})
Configure mongos routers
Set up mongos routers that client applications will connect to. These route queries to appropriate shards.
systemLog:
destination: file
path: /var/log/mongodb/mongos.log
logAppend: true
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongos.pid
net:
port: 27017
bindIp: 0.0.0.0
sharding:
configDB: configReplSet/10.0.1.10:27019,10.0.1.11:27019,10.0.1.12:27019
sudo mongos -f /etc/mongos.conf
Add shards to cluster
Connect to mongos and add each shard replica set to the cluster.
mongosh --port 27017
sh.addShard("shard01/10.0.2.10:27018,10.0.2.11:27018,10.0.2.12:27018")
sh.addShard("shard02/10.0.3.10:27018,10.0.3.11:27018,10.0.3.12:27018")
sh.addShard("shard03/10.0.4.10:27018,10.0.4.11:27018,10.0.4.12:27018")
Create geographic zones
Define zones that correspond to geographic regions and assign shards to zones.
sh.addShardToZone("shard01", "us-east")
sh.addShardToZone("shard02", "europe")
sh.addShardToZone("shard03", "asia-pacific")
Configure zone ranges
Define which data goes to which zone based on shard key values. This example uses country codes.
use myapp
sh.enableSharding("myapp")
// Configure zone ranges for user data based on country
sh.updateZoneKeyRange(
"myapp.users",
{ country: "AA" },
{ country: "MZ" },
"us-east"
)
sh.updateZoneKeyRange(
"myapp.users",
{ country: "NA" },
{ country: "SZ" },
"europe"
)
sh.updateZoneKeyRange(
"myapp.users",
{ country: "TA" },
{ country: "ZZ" },
"asia-pacific"
)
Enable sharding on collections
Enable sharding on collections and create shard keys that align with your zone strategy.
sh.shardCollection("myapp.users", { country: 1, userId: 1 })
sh.shardCollection("myapp.orders", { country: 1, orderId: 1 })
Configure balancer settings
Tune the balancer to respect zone boundaries and optimize data distribution.
use config
db.settings.updateOne(
{ _id: "balancer" },
{ $set: {
mode: "full",
activeWindow: {
start: "01:00",
stop: "05:00"
}
}},
{ upsert: true }
)
Create indexes for performance
Create indexes that support your shard key and common query patterns.
use myapp
db.users.createIndex({ country: 1, userId: 1 })
db.users.createIndex({ country: 1, email: 1 })
db.orders.createIndex({ country: 1, orderId: 1 })
db.orders.createIndex({ country: 1, customerId: 1 })
Verify your setup
Check cluster status and verify zone configuration is working correctly.
mongosh --port 27017
sh.status()
sh.getBalancerState()
db.adminCommand("listShards")
// Check zone configuration
use config
db.tags.find().pretty()
db.chunks.find({}, {shard: 1, min: 1, max: 1}).sort({min: 1})
sh.isBalancerRunning() and check chunk distribution regularly.Test geographic data distribution
Insert test data to verify documents are routed to correct geographic zones.
use myapp
// Insert users from different regions
db.users.insertMany([
{ userId: 1, country: "US", name: "John Doe", email: "john@example.com" },
{ userId: 2, country: "DE", name: "Anna Schmidt", email: "anna@example.com" },
{ userId: 3, country: "JP", name: "Tanaka Taro", email: "tanaka@example.com" }
])
// Check which shard each document landed on
db.users.find().explain("executionStats")
Monitor shard distribution
Set up monitoring to track data distribution and performance across zones. This helps with capacity planning and ensures optimal geographic distribution.
// Check data distribution per shard
db.runCommand({"dbStats": 1})
sh.status(true) // Detailed chunk information
// Monitor balancer activity
use config
db.actionlog.find({what: "balancer.round"}).sort({time: -1}).limit(5)
Common issues
| Symptom | Cause | Fix |
|---|---|---|
| Chunks not migrating to zones | Balancer disabled or zone ranges overlap | Check sh.getBalancerState() and verify zone ranges with db.tags.find() |
| Uneven data distribution | Poor shard key choice or skewed data | Monitor chunk distribution and consider compound shard keys |
| Config server connection errors | Network issues or config server down | Verify config server replica set health with rs.status() |
| Shard not accessible | Shard replica set issues | Check shard replica set status and connectivity from mongos |
| Slow queries across zones | Queries don't include shard key | Optimize queries to include shard key prefix for targeted routing |
Performance optimization
Fine-tune your sharded cluster for optimal performance across geographic zones. Consider implementing MongoDB monitoring with Prometheus and Grafana to track performance metrics.
Optimize read preferences
Configure read preferences to prefer local replicas in each geographic region.
// Java example
ReadPreference readPreference = ReadPreference.secondaryPreferred(
TagSet.builder().add("region", "us-east").build()
);
// Node.js example
const options = {
readPreference: 'secondaryPreferred',
readPreferenceTags: [{ region: 'us-east' }]
};
Configure write concerns
Set appropriate write concerns for each region to balance consistency and performance.
use myapp
db.users.createIndex(
{ country: 1, userId: 1 },
{
writeConcern: {
w: "majority",
j: true,
wtimeout: 5000
}
}
)
Backup and disaster recovery
Implement backup strategies that account for geographic distribution. Consider setting up automated backup procedures as described in our backup automation tutorial.
# Backup specific shard
mongodump --host shard01/10.0.2.10:27018 --out /backup/shard01
Backup config servers
mongodump --host configReplSet/10.0.1.10:27019 --out /backup/config
Next steps
- Set up MongoDB change streams for real-time applications
- Configure automated MongoDB backups
- Implement SSL encryption for replica sets
- Set up comprehensive MongoDB performance monitoring
- Harden MongoDB security for production environments
Running this in production?
Automated install script
Run this to automate the entire setup
#!/usr/bin/env bash
set -euo pipefail
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Global variables
MONGODB_VERSION="8.0"
CONFIG_DIR="/etc/mongod"
DATA_DIR="/var/lib/mongodb"
LOG_DIR="/var/log/mongodb"
MONGODB_USER="mongod"
REPLICA_SET_NAME="${1:-rs0}"
SHARD_NAME="${2:-shard01}"
# Usage function
usage() {
echo "Usage: $0 [replica_set_name] [shard_name]"
echo " replica_set_name: Name for the replica set (default: rs0)"
echo " shard_name: Name for the shard (default: shard01)"
echo ""
echo "Example: $0 configrs01 shard-us-east"
exit 1
}
# Cleanup function
cleanup() {
local exit_code=$?
if [ $exit_code -ne 0 ]; then
echo -e "${RED}[ERROR] Script failed. Cleaning up...${NC}"
systemctl stop mongod 2>/dev/null || true
systemctl disable mongod 2>/dev/null || true
fi
}
# Set trap for cleanup
trap cleanup ERR EXIT
# Check if running as root or with sudo
check_privileges() {
if [[ $EUID -ne 0 ]]; then
echo -e "${RED}[ERROR] This script must be run as root or with sudo${NC}"
exit 1
fi
}
# Detect distribution and set package manager
detect_distro() {
echo -e "${BLUE}[1/8] Detecting distribution...${NC}"
if [ -f /etc/os-release ]; then
. /etc/os-release
case "$ID" in
ubuntu|debian)
PKG_MGR="apt"
PKG_UPDATE="apt update"
PKG_INSTALL="apt install -y"
DISTRO_FAMILY="debian"
;;
almalinux|rocky|centos|rhel|ol)
PKG_MGR="dnf"
PKG_UPDATE="dnf check-update || true"
PKG_INSTALL="dnf install -y"
DISTRO_FAMILY="rhel"
;;
fedora)
PKG_MGR="dnf"
PKG_UPDATE="dnf check-update || true"
PKG_INSTALL="dnf install -y"
DISTRO_FAMILY="rhel"
;;
amzn)
PKG_MGR="yum"
PKG_UPDATE="yum check-update || true"
PKG_INSTALL="yum install -y"
DISTRO_FAMILY="rhel"
;;
*)
echo -e "${RED}[ERROR] Unsupported distribution: $ID${NC}"
exit 1
;;
esac
echo -e "${GREEN}Detected: $PRETTY_NAME ($DISTRO_FAMILY family)${NC}"
else
echo -e "${RED}[ERROR] Cannot detect distribution${NC}"
exit 1
fi
}
# Install prerequisites
install_prerequisites() {
echo -e "${BLUE}[2/8] Installing prerequisites...${NC}"
$PKG_UPDATE
if [ "$DISTRO_FAMILY" = "debian" ]; then
$PKG_INSTALL curl gnupg lsb-release ca-certificates
else
$PKG_INSTALL curl gnupg2 redhat-lsb-core ca-certificates
fi
echo -e "${GREEN}Prerequisites installed successfully${NC}"
}
# Configure MongoDB repository
configure_mongodb_repo() {
echo -e "${BLUE}[3/8] Configuring MongoDB repository...${NC}"
if [ "$DISTRO_FAMILY" = "debian" ]; then
# Add MongoDB GPG key
curl -fsSL https://www.mongodb.org/static/pgp/server-${MONGODB_VERSION}.asc | gpg --dearmor -o /usr/share/keyrings/mongodb-server-${MONGODB_VERSION}.gpg
# Detect Ubuntu version or use generic
if [ "$ID" = "ubuntu" ]; then
CODENAME=$(lsb_release -cs)
# Use jammy for newer versions if not specifically supported
case "$CODENAME" in
focal|jammy|noble) ;;
*) CODENAME="jammy" ;;
esac
else
# Debian
CODENAME="bookworm"
fi
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-${MONGODB_VERSION}.gpg ] https://repo.mongodb.org/apt/ubuntu ${CODENAME}/mongodb-org/${MONGODB_VERSION} multiverse" > /etc/apt/sources.list.d/mongodb-org-${MONGODB_VERSION}.list
$PKG_UPDATE
else
# RHEL family
cat > /etc/yum.repos.d/mongodb-org-${MONGODB_VERSION}.repo << EOF
[mongodb-org-${MONGODB_VERSION}]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/\$releasever/mongodb-org/${MONGODB_VERSION}/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-${MONGODB_VERSION}.asc
EOF
fi
echo -e "${GREEN}MongoDB repository configured successfully${NC}"
}
# Install MongoDB
install_mongodb() {
echo -e "${BLUE}[4/8] Installing MongoDB...${NC}"
$PKG_INSTALL mongodb-org
# Prevent automatic upgrades
if [ "$DISTRO_FAMILY" = "debian" ]; then
echo "mongodb-org hold" | dpkg --set-selections
echo "mongodb-org-database hold" | dpkg --set-selections
echo "mongodb-org-server hold" | dpkg --set-selections
echo "mongodb-mongosh hold" | dpkg --set-selections
echo "mongodb-org-mongos hold" | dpkg --set-selections
echo "mongodb-org-tools hold" | dpkg --set-selections
else
# RHEL family - exclude from updates in yum.conf
if ! grep -q "exclude=mongodb-org" /etc/yum.conf; then
echo "exclude=mongodb-org,mongodb-org-database,mongodb-org-server,mongodb-mongosh,mongodb-org-mongos,mongodb-org-tools" >> /etc/yum.conf
fi
fi
echo -e "${GREEN}MongoDB installed successfully${NC}"
}
# Configure MongoDB
configure_mongodb() {
echo -e "${BLUE}[5/8] Configuring MongoDB for sharding...${NC}"
# Create directories with proper permissions
mkdir -p $DATA_DIR $LOG_DIR $CONFIG_DIR
chown -R $MONGODB_USER:$MONGODB_USER $DATA_DIR $LOG_DIR
chmod 755 $DATA_DIR $LOG_DIR
# Create MongoDB configuration
cat > /etc/mongod.conf << EOF
# mongod.conf for sharded cluster
# Storage configuration
storage:
dbPath: $DATA_DIR
journal:
enabled: true
wiredTiger:
engineConfig:
cacheSizeGB: 1
directoryForIndexes: true
# Logging configuration
systemLog:
destination: file
logAppend: true
path: $LOG_DIR/mongod.log
logRotate: reopen
# Network configuration
net:
port: 27017
bindIp: 127.0.0.1,$(hostname -I | awk '{print $1}')
# Security configuration
security:
authorization: enabled
# Replication configuration
replication:
replSetName: $REPLICA_SET_NAME
# Sharding configuration
sharding:
clusterRole: shardsvr
# Process management
processManagement:
fork: true
pidFilePath: /var/run/mongodb/mongod.pid
# Enable free monitoring
setParameter:
enableLocalhostAuthBypass: false
EOF
chmod 644 /etc/mongod.conf
chown root:root /etc/mongod.conf
echo -e "${GREEN}MongoDB configuration created${NC}"
}
# Configure firewall
configure_firewall() {
echo -e "${BLUE}[6/8] Configuring firewall...${NC}"
if command -v firewall-cmd &> /dev/null; then
# firewalld (RHEL family)
systemctl enable firewalld
systemctl start firewalld
firewall-cmd --permanent --add-port=27017/tcp
firewall-cmd --permanent --add-port=27018/tcp
firewall-cmd --permanent --add-port=27019/tcp
firewall-cmd --reload
elif command -v ufw &> /dev/null; then
# UFW (Ubuntu/Debian)
ufw --force enable
ufw allow 27017/tcp
ufw allow 27018/tcp
ufw allow 27019/tcp
else
echo -e "${YELLOW}[WARNING] No supported firewall found. Please configure firewall manually to allow ports 27017-27019${NC}"
fi
echo -e "${GREEN}Firewall configured for MongoDB ports${NC}"
}
# Configure SELinux if present
configure_selinux() {
if command -v getenforce &> /dev/null && [ "$(getenforce)" != "Disabled" ]; then
echo -e "${BLUE}[7/8] Configuring SELinux for MongoDB...${NC}"
# Install SELinux policy tools if not present
if [ "$DISTRO_FAMILY" = "rhel" ]; then
$PKG_INSTALL policycoreutils-python-utils 2>/dev/null || $PKG_INSTALL policycoreutils-python 2>/dev/null || true
fi
# Set SELinux contexts
semanage fcontext -a -t mongod_exec_t "/usr/bin/mongod" 2>/dev/null || true
semanage fcontext -a -t mongod_var_lib_t "$DATA_DIR(/.*)?" 2>/dev/null || true
semanage fcontext -a -t mongod_log_t "$LOG_DIR(/.*)?" 2>/dev/null || true
restorecon -R $DATA_DIR $LOG_DIR /usr/bin/mongod 2>/dev/null || true
echo -e "${GREEN}SELinux configured for MongoDB${NC}"
fi
}
# Start and enable MongoDB service
start_mongodb_service() {
echo -e "${BLUE}[8/8] Starting MongoDB service...${NC}"
systemctl daemon-reload
systemctl enable mongod
systemctl start mongod
# Wait for MongoDB to start
echo -n "Waiting for MongoDB to start"
for i in {1..30}; do
if systemctl is-active mongod &> /dev/null; then
echo ""
echo -e "${GREEN}MongoDB service started successfully${NC}"
return 0
fi
echo -n "."
sleep 2
done
echo ""
echo -e "${RED}[ERROR] MongoDB failed to start within 60 seconds${NC}"
systemctl status mongod
exit 1
}
# Verify installation
verify_installation() {
echo -e "${BLUE}Verifying MongoDB installation...${NC}"
# Check service status
if systemctl is-active mongod &> /dev/null; then
echo -e "${GREEN}✓ MongoDB service is running${NC}"
else
echo -e "${RED}✗ MongoDB service is not running${NC}"
exit 1
fi
# Check MongoDB connection
if mongosh --eval "db.adminCommand('ping')" --quiet &> /dev/null; then
echo -e "${GREEN}✓ MongoDB is accepting connections${NC}"
else
echo -e "${YELLOW}⚠ MongoDB connection test failed (normal for secured instances)${NC}"
fi
# Check configuration
if [ -f /etc/mongod.conf ]; then
echo -e "${GREEN}✓ Configuration file exists${NC}"
else
echo -e "${RED}✗ Configuration file missing${NC}"
exit 1
fi
echo ""
echo -e "${GREEN}=== MongoDB Sharding Installation Complete ===${NC}"
echo -e "${BLUE}Next steps:${NC}"
echo "1. Initialize replica set: mongosh --eval 'rs.initiate()'"
echo "2. Create admin user for authentication"
echo "3. Configure config servers and mongos routers"
echo "4. Add shards to the cluster"
echo "5. Configure zones and zone ranges for geographic distribution"
echo ""
echo -e "${YELLOW}Configuration file: /etc/mongod.conf${NC}"
echo -e "${YELLOW}Data directory: $DATA_DIR${NC}"
echo -e "${YELLOW}Log file: $LOG_DIR/mongod.log${NC}"
}
# Main execution
main() {
echo -e "${GREEN}=== MongoDB Sharding Installation Script ===${NC}"
# Validate arguments
if [ "$#" -gt 2 ]; then
usage
fi
check_privileges
detect_distro
install_prerequisites
configure_mongodb_repo
install_mongodb
configure_mongodb
configure_firewall
configure_selinux
start_mongodb_service
verify_installation
# Disable cleanup trap on success
trap - ERR EXIT
}
# Execute main function
main "$@"
Review the script before running. Execute with: bash install.sh