Configure MongoDB sharding with zone-based data distribution for geographic workloads

Advanced 45 min May 30, 2026 78 views
Ubuntu 24.04 Debian 12 AlmaLinux 9 Rocky Linux 9

Set up MongoDB sharding with geographic zones to distribute data based on location, ensuring optimal performance for global applications and regulatory compliance.

Prerequisites

  • At least 9 servers (3 config, 6+ shard members)
  • MongoDB 8.0 or higher
  • Network connectivity between all servers
  • Basic understanding of MongoDB concepts

What this solves

MongoDB sharding distributes data across multiple servers to handle large datasets and high throughput. Zone-based sharding adds geographic awareness, placing data in specific regions based on configurable rules. This ensures low latency for users, meets data sovereignty requirements, and optimizes resource utilization for global applications.

Step-by-step configuration

Install MongoDB on all servers

Install MongoDB Community Server on the servers that will host config servers, mongos routers, and shard members.

curl -fsSL https://www.mongodb.org/static/pgp/server-8.0.asc | sudo gpg --dearmor -o /usr/share/keyrings/mongodb-server-8.0.gpg
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-8.0.gpg ] https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/8.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-8.0.list
sudo apt update
sudo apt install -y mongodb-org
cat > /etc/yum.repos.d/mongodb-org-8.0.repo << EOF
[mongodb-org-8.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/9/mongodb-org/8.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-8.0.asc
EOF
sudo dnf install -y mongodb-org

Configure config servers

Set up three config servers for metadata storage. Config servers store cluster metadata and shard configuration.

systemLog:
  destination: file
  path: /var/log/mongodb/mongod-config.log
  logAppend: true
storage:
  dbPath: /var/lib/mongodb-config
  journal:
    enabled: true
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongod-config.pid
net:
  port: 27019
  bindIp: 0.0.0.0
replication:
  replSetName: configReplSet
sharding:
  clusterRole: configsvr
sudo mkdir -p /var/lib/mongodb-config
sudo chown mongodb:mongodb /var/lib/mongodb-config
sudo systemctl start mongod -f /etc/mongod-config.conf

Initialize config server replica set

Connect to one config server and initialize the replica set with all three config servers.

mongosh --port 27019
rs.initiate({
  _id: "configReplSet",
  configsvr: true,
  members: [
    { _id: 0, host: "10.0.1.10:27019" },
    { _id: 1, host: "10.0.1.11:27019" },
    { _id: 2, host: "10.0.1.12:27019" }
  ]
})

Configure shard servers

Set up shard servers in different geographic regions. Each shard should be a replica set for high availability.

systemLog:
  destination: file
  path: /var/log/mongodb/mongod-shard.log
  logAppend: true
storage:
  dbPath: /var/lib/mongodb-shard
  journal:
    enabled: true
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongod-shard.pid
net:
  port: 27018
  bindIp: 0.0.0.0
replication:
  replSetName: shard01
sharding:
  clusterRole: shardsvr
sudo mkdir -p /var/lib/mongodb-shard
sudo chown mongodb:mongodb /var/lib/mongodb-shard
sudo systemctl start mongod -f /etc/mongod-shard.conf

Initialize shard replica sets

Initialize replica sets for each geographic region. Repeat this for each shard in different regions.

mongosh --port 27018
rs.initiate({
  _id: "shard01",
  members: [
    { _id: 0, host: "10.0.2.10:27018" },
    { _id: 1, host: "10.0.2.11:27018" },
    { _id: 2, host: "10.0.2.12:27018" }
  ]
})

Configure mongos routers

Set up mongos routers that client applications will connect to. These route queries to appropriate shards.

systemLog:
  destination: file
  path: /var/log/mongodb/mongos.log
  logAppend: true
processManagement:
  fork: true
  pidFilePath: /var/run/mongodb/mongos.pid
net:
  port: 27017
  bindIp: 0.0.0.0
sharding:
  configDB: configReplSet/10.0.1.10:27019,10.0.1.11:27019,10.0.1.12:27019
sudo mongos -f /etc/mongos.conf

Add shards to cluster

Connect to mongos and add each shard replica set to the cluster.

mongosh --port 27017
sh.addShard("shard01/10.0.2.10:27018,10.0.2.11:27018,10.0.2.12:27018")
sh.addShard("shard02/10.0.3.10:27018,10.0.3.11:27018,10.0.3.12:27018")
sh.addShard("shard03/10.0.4.10:27018,10.0.4.11:27018,10.0.4.12:27018")

Create geographic zones

Define zones that correspond to geographic regions and assign shards to zones.

sh.addShardToZone("shard01", "us-east")
sh.addShardToZone("shard02", "europe")
sh.addShardToZone("shard03", "asia-pacific")

Configure zone ranges

Define which data goes to which zone based on shard key values. This example uses country codes.

use myapp
sh.enableSharding("myapp")

// Configure zone ranges for user data based on country
sh.updateZoneKeyRange(
  "myapp.users",
  { country: "AA" },
  { country: "MZ" },
  "us-east"
)

sh.updateZoneKeyRange(
  "myapp.users",
  { country: "NA" },
  { country: "SZ" },
  "europe"
)

sh.updateZoneKeyRange(
  "myapp.users",
  { country: "TA" },
  { country: "ZZ" },
  "asia-pacific"
)

Enable sharding on collections

Enable sharding on collections and create shard keys that align with your zone strategy.

sh.shardCollection("myapp.users", { country: 1, userId: 1 })
sh.shardCollection("myapp.orders", { country: 1, orderId: 1 })

Configure balancer settings

Tune the balancer to respect zone boundaries and optimize data distribution.

use config
db.settings.updateOne(
  { _id: "balancer" },
  { $set: { 
    mode: "full",
    activeWindow: {
      start: "01:00",
      stop: "05:00"
    }
  }},
  { upsert: true }
)

Create indexes for performance

Create indexes that support your shard key and common query patterns.

use myapp
db.users.createIndex({ country: 1, userId: 1 })
db.users.createIndex({ country: 1, email: 1 })
db.orders.createIndex({ country: 1, orderId: 1 })
db.orders.createIndex({ country: 1, customerId: 1 })

Verify your setup

Check cluster status and verify zone configuration is working correctly.

mongosh --port 27017
sh.status()
sh.getBalancerState()
db.adminCommand("listShards")

// Check zone configuration
use config
db.tags.find().pretty()
db.chunks.find({}, {shard: 1, min: 1, max: 1}).sort({min: 1})
Note: The balancer may take time to migrate existing chunks to appropriate zones. Monitor with sh.isBalancerRunning() and check chunk distribution regularly.

Test geographic data distribution

Insert test data to verify documents are routed to correct geographic zones.

use myapp

// Insert users from different regions
db.users.insertMany([
  { userId: 1, country: "US", name: "John Doe", email: "john@example.com" },
  { userId: 2, country: "DE", name: "Anna Schmidt", email: "anna@example.com" },
  { userId: 3, country: "JP", name: "Tanaka Taro", email: "tanaka@example.com" }
])

// Check which shard each document landed on
db.users.find().explain("executionStats")

Monitor shard distribution

Set up monitoring to track data distribution and performance across zones. This helps with capacity planning and ensures optimal geographic distribution.

// Check data distribution per shard
db.runCommand({"dbStats": 1})
sh.status(true)  // Detailed chunk information

// Monitor balancer activity
use config
db.actionlog.find({what: "balancer.round"}).sort({time: -1}).limit(5)
Warning: Changing zone ranges after data exists requires the balancer to migrate chunks, which can impact performance. Plan zone boundaries carefully based on your data distribution patterns.

Common issues

SymptomCauseFix
Chunks not migrating to zonesBalancer disabled or zone ranges overlapCheck sh.getBalancerState() and verify zone ranges with db.tags.find()
Uneven data distributionPoor shard key choice or skewed dataMonitor chunk distribution and consider compound shard keys
Config server connection errorsNetwork issues or config server downVerify config server replica set health with rs.status()
Shard not accessibleShard replica set issuesCheck shard replica set status and connectivity from mongos
Slow queries across zonesQueries don't include shard keyOptimize queries to include shard key prefix for targeted routing

Performance optimization

Fine-tune your sharded cluster for optimal performance across geographic zones. Consider implementing MongoDB monitoring with Prometheus and Grafana to track performance metrics.

Optimize read preferences

Configure read preferences to prefer local replicas in each geographic region.

// Java example
ReadPreference readPreference = ReadPreference.secondaryPreferred(
    TagSet.builder().add("region", "us-east").build()
);

// Node.js example
const options = {
  readPreference: 'secondaryPreferred',
  readPreferenceTags: [{ region: 'us-east' }]
};

Configure write concerns

Set appropriate write concerns for each region to balance consistency and performance.

use myapp
db.users.createIndex(
  { country: 1, userId: 1 },
  { 
    writeConcern: { 
      w: "majority", 
      j: true,
      wtimeout: 5000
    }
  }
)

Backup and disaster recovery

Implement backup strategies that account for geographic distribution. Consider setting up automated backup procedures as described in our backup automation tutorial.

# Backup specific shard
mongodump --host shard01/10.0.2.10:27018 --out /backup/shard01

Backup config servers

mongodump --host configReplSet/10.0.1.10:27019 --out /backup/config

Next steps

Running this in production?

Want this managed for you? Running MongoDB sharding at scale adds a second layer of work: capacity planning, failover drills, cost control, and on-call expertise. See how we run infrastructure like this for European SaaS and fintech teams.

Automated install script

Run this to automate the entire setup

Need help?

Don't want to manage this yourself?

We handle high availability infrastructure for businesses that depend on uptime. From initial setup to ongoing operations.