The shard key determines the distribution of the collection’s documents among the cluster’s shards. The shard key is either a single indexed field or a set of indexed compound fields, and it must exist in every document in the collection.
MongoDB attempts to distribute chunks evenly among the shards in the cluster. The shard key has a direct relationship to the effectiveness of chunk distribution. See Choosing a Shard Key.
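As a minimal sketch of how a single-field or compound shard key might be declared, the following builds the document for the `shardCollection` admin command without sending it. The helper `build_shard_collection_command`, the namespace `records.people`, and the field names `zipcode` and `name` are placeholders chosen for illustration, not names from this page.

```python
from typing import Dict, List


def build_shard_collection_command(namespace: str, key_fields: List[str]) -> Dict[str, object]:
    """Build the document for the shardCollection admin command.

    Hypothetical helper: every field gets direction 1 (ranged sharding).
    Passing more than one field produces a compound shard key.
    """
    if not key_fields:
        raise ValueError("a shard key needs at least one field")
    return {
        "shardCollection": namespace,  # "<database>.<collection>"
        "key": {field: 1 for field in key_fields},
    }


single = build_shard_collection_command("records.people", ["zipcode"])
compound = build_shard_collection_command("records.people", ["zipcode", "name"])
print(single)
print(compound)

# Assumption: with a running sharded cluster and pymongo installed, the
# document could be sent through a mongos, for example:
#   MongoClient("mongodb://localhost:27017").admin.command(single)
```

The sketch only constructs the command document, so it runs without a cluster. The shard key fields must already be covered by a suitable index when the collection contains data.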
Once you shard a collection, the shard key and the shard key values are immutable. That is:
- You cannot select a different shard key for that collection.
- You cannot update the values of the shard key fields.
The choice of shard key affects how the sharded cluster balancer creates and distributes chunks across the available shards. This in turn affects the overall efficiency and performance of operations within the sharded cluster, and of the sharding strategy the cluster uses.
The ideal shard key allows MongoDB to distribute documents evenly throughout the cluster.
For restrictions on shard keys, see Shard Key Limitations.
When sharding a collection that is not empty, the shard key can constrain the maximum supported collection size for the initial sharding operation only. See Sharding Existing Collection Data Size.
A sharded collection can grow to any size after successful sharding.
A shard key on a value that increases or decreases monotonically is more likely to distribute inserts to a single shard within the cluster.
This occurs because every cluster has a chunk that captures a range with an upper bound of maxKey. maxKey always compares as higher than all other values. Similarly, there is a chunk that captures a range with a lower bound of minKey. minKey always compares as lower than all other values.
If the shard key value is always increasing, all new inserts are routed to the chunk with maxKey as the upper bound. If the shard key value is always decreasing, all new inserts are routed to the chunk with minKey as the lower bound. The shard containing that chunk becomes the bottleneck for write operations.
The following image illustrates a sharded cluster using the field X as the shard key. If the values for X are monotonically increasing, the distribution of inserts may look similar to the following:
[Image: distribution of inserts for monotonically increasing values of X]
If the shard key value were monotonically decreasing, then all inserts would route to Chunk A, the chunk with minKey as the lower bound, instead.
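The routing behavior described above can be sketched with a small simulation. This is an illustration under stated assumptions, not MongoDB's routing code: the split points `10` and `20`, the three chunk names, and the helper `route` are all hypothetical, and the chunk boundaries are held fixed rather than split as data grows.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical split points for shard key X. They define three chunks,
# each with an inclusive lower bound and an exclusive upper bound:
#   Chunk A: [minKey, 10)   Chunk B: [10, 20)   Chunk C: [20, maxKey)
SPLIT_POINTS = [10, 20]
CHUNK_NAMES = ["Chunk A", "Chunk B", "Chunk C"]


def route(x):
    """Return the name of the chunk whose range contains x."""
    return CHUNK_NAMES[bisect_right(SPLIT_POINTS, x)]


# 1,000 inserts whose key keeps increasing past the last split point.
increasing = Counter(route(x) for x in range(21, 1021))
# 1,000 inserts whose key keeps decreasing below the first split point.
decreasing = Counter(route(x) for x in range(9, -991, -1))

print(increasing)
print(decreasing)
```

In this sketch every increasing insert lands in the chunk bounded above by maxKey and every decreasing insert lands in the chunk bounded below by minKey, so a single chunk, and therefore a single shard, receives all the writes.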
A shard key that does not change monotonically does not guarantee even distribution of data across the sharded cluster. The cardinality and frequency of the shard key also contribute to data distribution. Consider each factor when choosing a shard key.
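To make cardinality and frequency concrete, the following sketch profiles a sample of candidate shard key values. The helper `key_profile` and the three sample data sets are invented for illustration; a real assessment would sample values from the actual collection.

```python
from collections import Counter


def key_profile(values):
    """Return (cardinality, share of documents holding the most frequent value)."""
    if not values:
        raise ValueError("need at least one value")
    counts = Counter(values)
    most_common_count = counts.most_common(1)[0][1]
    return len(counts), most_common_count / len(values)


# Hypothetical shard key values taken from 1,000 documents each.
samples = {
    "low_cardinality": ["EU", "US", "ASIA", "US"] * 250,
    "skewed": ["hot"] * 900 + [f"user{i}" for i in range(100)],
    "well_spread": [f"user{i}" for i in range(1000)],
}

for name, sample in samples.items():
    cardinality, top_share = key_profile(sample)
    print(f"{name}: cardinality={cardinality}, top value share={top_share:.1%}")
```

A small number of distinct values limits how many chunks can exist, since documents with the same shard key value belong to the same chunk. A large share held by one value concentrates documents in whichever chunk contains it. Neither sample pattern is monotonic, yet only the last one is a plausible candidate for even distribution.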
If your data model requires sharding on a key that changes monotonically, consider using Hashed Sharding.
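The idea behind hashing a monotonic key can be sketched as follows. This is a simplified model, not MongoDB's implementation: `hashlib.md5` stands in for the hash function, the key is hashed from its string form, and `NUM_CHUNKS` and `hashed_chunk` are hypothetical names.

```python
import hashlib
from collections import Counter

NUM_CHUNKS = 4  # hypothetical number of equally sized hash ranges


def hashed_chunk(x: int) -> int:
    """Map a key to a chunk index by hashing it and splitting the
    64-bit hash space into NUM_CHUNKS equal ranges."""
    digest = hashlib.md5(str(x).encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:8], "big")  # first 64 bits of the digest
    return hashed * NUM_CHUNKS // 2**64


# 10,000 inserts with a monotonically increasing key.
counts = Counter(hashed_chunk(x) for x in range(10_000))
print(sorted(counts.items()))
```

Because consecutive key values hash to unrelated positions in the hash space, the inserts spread across all chunks instead of piling onto the one bounded by maxKey. The trade-off is that values close together in key order no longer sit in the same chunk, so range queries on the key are more likely to touch many shards.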