M103: MongoDB Cluster Administration: Sharding

MongoDB M103 Course Notes Chapter 3 of 3

Replication vs Sharding

Replication gives multiple servers identical copies of the data: each server is a mirror of the others. With sharding, each shard holds a different subset of the data.

One goal of sharding is to build a cluster out of multiple instances (or machines) that, to the application, looks like a single server.

To hide the details of the database architecture from the application, requests are first routed through a mongos process before reaching the shards.

Misc

Insert some data:

for (var i = 0; i < 100000; i++) {
  db.users.insert({ "username": "user" + i, "created_at": new Date() });
}

What is Sharding?

A server in a replica set stores the whole database, and it cannot be scaled vertically forever.

Sharding is for horizontal scaling: it cuts a big database into pieces. Replication, by contrast, removes the single point of failure and guarantees high availability.

Sharding is horizontal scaling.

Sharding allows us to grow our dataset without worrying about being able to store it all on one server.

In MongoDB, scaling is done horizontally, which means instead of making the individual machines better, we just add more machines and then distribute the dataset among those machines.

That router process is called mongos. Clients connect to mongos instead of connecting to each shard individually, and we can run any number of mongos processes to service many different applications or requests against the same sharded cluster.

  • Shards: store distributed collections
  • Config Servers (Config Server Replica Set): store metadata about each shard
  • Mongos: routes queries to shards
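
Since clients only ever talk to mongos, a quick way to confirm that a shell session is connected to a mongos rather than a plain mongod is the isdbgrid command (an optional check, not from the lecture):

// Succeeds with { "isdbgrid" : 1, ... } only on a mongos router;
// a regular mongod rejects it as an unknown command
db.adminCommand( { isdbgrid: 1 } )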

When to Shard

One scenario is zone sharding, which distributes data geographically; stay tuned for that.

We should consider sharding when:

Our organization outgrows the most powerful servers available, limiting our vertical scaling options. This is correct - sharding can provide an alternative to vertical scaling.

Government regulations require data to be located in a specific geography. This is correct - sharding allows us to store different pieces of data in specific countries or regions.

We are holding more than 5TB per server and operational costs increase dramatically. This is correct - generally, when our deployment reaches 2-5TB per server, we should consider sharding.

Sharding Architecture

Setting Up a Sharded Cluster

What do we need to do?

  • build a CSRS (a config server replica set)
  • start mongos, pointed at the CSRS
  • convert the existing replica set into a shard and connect it to the cluster

If you’d like to deploy a sharded cluster on your machine, you can find the commands from the lecture here:

Configuration file for first config server csrs_1.conf:

sharding:
  clusterRole: configsvr
replication:
  replSetName: m103-csrs
security:
  keyFile: /var/mongodb/pki/m103-keyfile
net:
  bindIp: localhost,192.168.103.100
  port: 26001
systemLog:
  destination: file
  path: /var/mongodb/db/csrs1.log
  logAppend: true
processManagement:
  fork: true
storage:
  dbPath: /var/mongodb/db/csrs1

csrs_2.conf:

sharding:
  clusterRole: configsvr
replication:
  replSetName: m103-csrs
security:
  keyFile: /var/mongodb/pki/m103-keyfile
net:
  bindIp: localhost,192.168.103.100
  port: 26002
systemLog:
  destination: file
  path: /var/mongodb/db/csrs2.log
  logAppend: true
processManagement:
  fork: true
storage:
  dbPath: /var/mongodb/db/csrs2

csrs_3.conf:

sharding:
  clusterRole: configsvr
replication:
  replSetName: m103-csrs
security:
  keyFile: /var/mongodb/pki/m103-keyfile
net:
  bindIp: localhost,192.168.103.100
  port: 26003
systemLog:
  destination: file
  path: /var/mongodb/db/csrs3.log
  logAppend: true
processManagement:
  fork: true
storage:
  dbPath: /var/mongodb/db/csrs3

Starting the three config servers:

mongod -f csrs_1.conf
mongod -f csrs_2.conf
mongod -f csrs_3.conf

Connect to one of the config servers:

mongo --port 26001

Initiating the CSRS:

rs.initiate()

Creating super user on CSRS:

use admin
db.createUser({
  user: "m103-admin",
  pwd: "m103-pass",
  roles: [
    {role: "root", db: "admin"}
  ]
})

Authenticating as the super user:

db.auth("m103-admin", "m103-pass")

Add the second and third node to the CSRS:

rs.add("192.168.103.100:26002")
rs.add("192.168.103.100:26003")

Mongos config (mongos.conf):

sharding:
  configDB: m103-csrs/192.168.103.100:26001,192.168.103.100:26002,192.168.103.100:26003
security:
  keyFile: /var/mongodb/pki/m103-keyfile
net:
  bindIp: localhost,192.168.103.100
  port: 26000
systemLog:
  destination: file
  path: /var/mongodb/db/mongos.log
  logAppend: true
processManagement:
  fork: true

Start the mongos server:

mongos -f mongos.conf

Connect to mongos:

mongo --port 26000 --username m103-admin --password m103-pass --authenticationDatabase admin

Check sharding status:

MongoDB Enterprise mongos> sh.status()

Updated configuration for node1.conf:

sharding:
  clusterRole: shardsvr
storage:
  dbPath: /var/mongodb/db/node1
  wiredTiger:
    engineConfig:
      cacheSizeGB: .1
net:
  bindIp: 192.168.103.100,localhost
  port: 27011
security:
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/node1/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-repl

Updated configuration for node2.conf:

sharding:
  clusterRole: shardsvr
storage:
  dbPath: /var/mongodb/db/node2
  wiredTiger:
    engineConfig:
      cacheSizeGB: .1
net:
  bindIp: 192.168.103.100,localhost
  port: 27012
security:
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/node2/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-repl

Updated configuration for node3.conf:

sharding:
  clusterRole: shardsvr
storage:
  dbPath: /var/mongodb/db/node3
  wiredTiger:
    engineConfig:
      cacheSizeGB: .1
net:
  bindIp: 192.168.103.100,localhost
  port: 27013
security:
  keyFile: /var/mongodb/pki/m103-keyfile
systemLog:
  destination: file
  path: /var/mongodb/db/node3/mongod.log
  logAppend: true
processManagement:
  fork: true
replication:
  replSetName: m103-repl

Connecting directly to secondary node (note that if an election has taken place in your replica set, the specified node may have become primary):

mongo --port 27012 -u "m103-admin" -p "m103-pass" --authenticationDatabase "admin"

Shutting down node:

use admin
db.shutdownServer()

Restarting node with new configuration:

mongod -f node2.conf

Stepping down current primary:

rs.stepDown()

Adding new shard to cluster from mongos:

sh.addShard("m103-repl/192.168.103.100:27012")
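
To confirm the shard was registered, one option is the listShards command from mongos (an optional check; sh.status() shows the same information):

// Lists each shard with its replica set name and member hosts
db.adminCommand( { listShards: 1 } )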

Recap

  • Launching mongos and CSRS
  • Enabling sharding on a RS
    • Rolling upgrade!
  • Adding shards to a cluster

Process startup order

Do not start the shard mongod nodes before the CSRS servers and mongos are started.

  1. start all CSRS servers
  2. start mongos
  3. start the shard replica set mongods

Sample ps output once everything is running:

501  1938     1   0  9:46AM ??         0:07.58 mongod -f etc/node1.yaml
501  2043     1   0  9:49AM ??         0:04.83 mongod -f etc/node2.yaml
501  2085     1   0  9:49AM ??         0:02.83 mongod -f etc/csrs_1.yaml
501  2088     1   0  9:50AM ??         0:03.01 mongod -f etc/csrs_2.yaml
501  2091     1   0  9:50AM ??         0:02.92 mongod -f etc/csrs_3.yaml
501  2113     1   0  9:50AM ??         0:04.11 mongod -f etc/node3.yaml

Tips

  • Mongos inherits its users from the config servers.
  • The mongos configuration file needs to specify the config servers.
  • The mongos configuration file doesn’t need to have a dbpath.

Lab: Deploy a Sharded Cluster

Config DB

You should generally never write data to it directly.

Connect to mongos and check the sharding status:

sh.status()

If you’d like to explore the collections on the config database, you can find the instructions here:

Switch to config DB:

use config

Query config.databases:

db.databases.find().pretty()

Query config.collections:

db.collections.find().pretty()

Query config.shards:

db.shards.find().pretty()

Query config.chunks:

db.chunks.find().pretty()

Query config.mongos:

db.mongos.find().pretty()

When should you manually write data to the config DB? Only when directed to by MongoDB documentation or Support Engineers.

Shard Keys

A shard key is an indexed field (or set of fields) in the collection that determines how its data is sliced into pieces.

  • The shard key must be present in every document of the collection.
  • Shard key fields must be indexed.
  • Shard keys are immutable.
  • Shard keys are permanent (you cannot unshard a collection).

If you’d like to shard a collection, you can find instructions to create a shard key here:

Show collections in m103 database:

use m103
show collections

Enable sharding on the m103 database:

sh.enableSharding("m103")

Find one document from the products collection, to help us choose a shard key:

db.products.findOne()

Create an index on sku:

db.products.createIndex( { "sku": 1 } )

Shard the products collection on sku:

sh.shardCollection( "m103.products", { "sku": 1 } )

Checking the status of the sharded cluster:

sh.status()

Recap

  • Shard Keys determine data distribution in a sharded cluster
  • Shard Keys are immutable
  • Shard Key Values are immutable
  • Create the underlying index first before sharding on the indexed field or fields
  • Shard keys are used to route queries to specific shards

Picking a Good Shard Key

Unsharding is hard, so pick the shard key carefully.

Cardinality: the number of distinct values a field can take.

Problem:

Which of the following are indicators that a field or fields are a good shard key choice?

  • High Cardinality
  • Low Frequency
  • Non-monotonic change

Hashed Shard Keys

Addtional type of shard key. Hashed shard keys, which underline index is hashed.

More even distributed data.
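
As a minimal sketch, sharding the users collection from the earlier insert loop on a hashed key would look like this (assuming sharding is already enabled on the m103 database):

// Create a hashed index, then shard on it; hashing spreads the
// sequential "user0", "user1", ... values evenly across chunks
db.users.createIndex( { "username": "hashed" } )
sh.shardCollection( "m103.users", { "username": "hashed" } )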

Lab: Shard a Collection

Import json data

mongoimport --port=26000 --host=localhost --authenticationDatabase=admin -u m103-admin -p m103-pass --db m103 --collection products --file /dataset/products.json

Connect to mongos:

mongo --port 26000 --username m103-admin --password m103-pass --authenticationDatabase admin

Create an index on sku:

db.products.createIndex( { "sku": 1 } )

Shard the products collection on sku:

sh.shardCollection( "m103.products", { "sku": 1 } )

Checking the status of the sharded cluster:

sh.status()

Chunks

MongoDB uses the shard key associated with the collection to partition the data into chunks. A chunk is a subset of the sharded data, and each chunk has an inclusive lower bound and an exclusive upper bound based on the shard key.

// connect to mongos
use config
db.chunks.findOne()

1 MB <= chunk size <= 1024 MB (the default is 64 MB)
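
Changing the chunk size is one of the few documented cases where writing to the config database is expected. A sketch, using 2 MB as an example value (the value is in MB):

use config
// Upsert the "chunksize" settings document; newer shells prefer
// an updateOne with upsert over save()
db.settings.save( { _id: "chunksize", value: 2 } )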

Chunk ranges have an inclusive minimum and an exclusive maximum.

Balancing

Start the balancer:

sh.startBalancer(timeout, interval)

Stop the balancer:

sh.stopBalancer(timeout, interval)

Enable/disable the balancer:

sh.setBalancerState(boolean)
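
Two related read-only checks, both standard shell helpers:

sh.getBalancerState()   // true if the balancer is enabled
sh.isBalancerRunning()  // true if a balancing round is in progress right now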

Queries in a Sharded Cluster

targeted vs scatter gather

Targeted Queries vs Scatter Gather: Part 1

Targeted Queries vs Scatter Gather: Part 2

Show collections in the m103 database:

use m103
show collections

Targeted query with explain() output:

db.products.find({"sku" : 1000000749 }).explain()

Scatter gather query with explain() output:

db.products.find( {
  "name" : "Gods And Heroes: Rome Rising - Windows [Digital Download]" }
).explain()
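
In the explain output from mongos, the top-level winning plan stage tells the two cases apart: SINGLE_SHARD for a targeted query that reaches one shard, SHARD_MERGE when results from several shards must be merged. A quick way to check (the sku value is reused from the query above):

var exp = db.products.find( { "sku": 1000000749 } ).explain()
// Prints "SINGLE_SHARD" for a targeted query, "SHARD_MERGE" for scatter gather
print(exp.queryPlanner.winningPlan.stage)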

Problem

Given a collection that is sharded on the following shard key:

{ "sku" : 1, "name" : 1 }

Which of the following queries results in a targeted query?

db.products.find( { "sku" : 1337, "name" : "MongoHacker" } )
db.products.find( { "name" : "MongoHacker", "sku" : 1337 } )

This is correct.

These two queries are actually identical, and can both be targeted using the shard key.

db.products.find( { "sku" : 1337 } )

This query includes the sku prefix and can therefore be targeted.

Lab: Detect Scatter Gather Queries

Which of the following is required in order for a query to be targeted to a subset of shards?

  • The query uses the shard key
  • An index exists on the shard key

In order for a query to be targeted to a subset of shards, the query must use the shard key. The data itself is partitioned on the shard key, so without it the server cannot locate the data and must fall back to a scatter gather query.

In order for a query to be targeted, an index must exist on the shard key. This is required before the collection can be sharded in the first place.

Problems

Given the following replica set configuration:

conf = {
  "_id": "replset",
  "version": 1,
  "protocolVersion": 1,
  "members": [
    {
      "_id": 0,
      "host": "localhost:27017",
      "priority": 1,
      "votes": 1
    },
    {
      "_id": 1,
      "host": "localhost:27018",
      "priority": 1,
      "votes": 1
    },
    {
      "_id": 2,
      "host": "localhost:27019",
      "priority": 1,
      "votes": 1
    },
    {
      "_id": 3,
      "host": "localhost:27020",
      "priority": 0,
      "votes": 0,
      "slaveDelay": 3600
    }
  ]
}

With priority: 0, votes: 0, and slaveDelay: 3600, the fourth member is a delayed, non-voting member. It serves as a "hot" backup of data in case of accidental data loss on the other members, like a DBA accidentally dropping the database: its copy of the data stays one hour behind, leaving a window to recover.

Given the following shard key:

{ "country": 1, "_id": 1 }

Which of the following queries will be routed (targeted)? Remember that queries may be routed to more than one shard.

The correct answers are:

db.customers.find({"country": "Norway", "_id": 54})

This specifies both fields of the shard key.

db.customers.find({"country": { $gte: "Portugal", $lte: "Spain" }})

This specifies a prefix of the shard key, "country", and will be routed to the shards containing the relevant ranges.

db.customers.find({"_id": 914, "country": "Sweden"})

Although the fields are specified in reverse order, this is a routed query. Any document matching {"_id": 914, "country": "Sweden"} must be identical to {"country": "Sweden", "_id": 914}, so the query planner reorders the fields and targets the query.

The incorrect answer is:

db.customers.find({"_id": 455})

Because neither a prefix of the shard key nor the full shard key is provided, mongos has no way to determine how to route this query. Instead, it sends the query to all shards in the cluster in a scatter gather operation.