Hashed sharding provides a more even data distribution across the sharded cluster, at the cost of fewer targeted operations and more broadcast operations. After hashing, documents with “close” shard key values are unlikely to be on the same chunk or shard, so mongos is more likely to perform broadcast operations to fulfill a given range query.
mongos can target queries with equality matches to a single shard.
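The difference between equality and range queries can be sketched with a small simulation. This is only an illustration: the MD5-based `stand_in_hash`, the four-shard layout, and the helper names are assumptions made for the example, not MongoDB's actual hash function or routing logic.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def stand_in_hash(value: int) -> int:
    """Illustrative 64-bit hash; NOT MongoDB's real hashing function."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(value: int) -> int:
    """Split the 64-bit hash space into NUM_SHARDS equal ranges."""
    return (stand_in_hash(value) * NUM_SHARDS) >> 64


def shards_for_equality(value: int) -> set:
    """An equality match hashes to exactly one point, hence one shard."""
    return {shard_for(value)}


def shards_for_range(low: int, high: int) -> set:
    """Shards that hold at least one key in [low, high] after hashing."""
    return {shard_for(v) for v in range(low, high + 1)}


equality_targets = shards_for_equality(1000)
range_targets = shards_for_range(1000, 1099)
print("equality match touches shards:", sorted(equality_targets))
print("range query touches shards:   ", sorted(range_targets))
```

In this sketch the equality match resolves to one shard, while 100 adjacent key values scatter across several shards, which is why a range query on a hashed shard key generally cannot be targeted.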
If you shard an empty collection using a hashed shard key, MongoDB automatically creates two empty chunks per shard to cover the entire range of the hashed shard key value across the cluster. You can control how many chunks MongoDB creates with the numInitialChunks parameter to shardCollection or by manually creating chunks on the empty collection.
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need to compute hashes.
The field you choose as your hashed shard key should have good cardinality, that is, a large number of different values. Hashed keys are ideal for shard keys with fields that change monotonically, like ObjectId values or timestamps. A good example of this is the default _id field, assuming it only contains ObjectId values.
To shard a collection using a hashed shard key, see Shard a Collection.
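As a rough sketch of what such a request looks like, the snippet below builds a shardCollection command document with a hashed key and an optional numInitialChunks value. The helper name `build_shard_collection_command` and the namespace `records.events` are hypothetical, and the snippet only constructs the document; it does not connect to a cluster.

```python
def build_shard_collection_command(namespace, field, num_initial_chunks=None):
    """Build a shardCollection command document for a hashed shard key.

    `namespace` is "<database>.<collection>"; `field` is the shard key field.
    """
    # The command name must be the first key; Python dicts keep insertion order.
    command = {"shardCollection": namespace, "key": {field: "hashed"}}
    if num_initial_chunks is not None:
        # Only meaningful when the collection is empty, as described above.
        command["numInitialChunks"] = num_initial_chunks
    return command


# Hypothetical namespace and chunk count, chosen only for illustration.
command = build_shard_collection_command("records.events", "_id", num_initial_chunks=8)
print(command)

# With a driver such as PyMongo connected to a mongos, a document like this
# could be passed to client.admin.command(command). That step is assumed here
# and not executed.
```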
Given a collection using a monotonically increasing value X as the shard key, ranged sharding concentrates incoming inserts in one place. Since the value of X is always increasing, the chunk with an upper bound of maxKey receives the majority of incoming writes. This restricts insert operations to the single shard containing this chunk, which reduces or removes the advantage of distributed writes in a sharded cluster.
By using a hashed index on X, consecutive values of X hash to unrelated positions in the hashed key range, so inserts land on different chunks. Since the data is now distributed more evenly, inserts are efficiently distributed throughout the cluster.
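The contrast between the two strategies can be sketched with a small simulation. The assumptions here are illustrative only: four shards, fixed ranged split points left over from earlier data, and an MD5-based `stand_in_hash` that is not MongoDB's real hash function.

```python
import bisect
import hashlib
from collections import Counter

NUM_SHARDS = 4
NUM_INSERTS = 10_000

# Hypothetical ranged chunk boundaries: [minKey,1000), [1000,2000), [2000,3000), [3000,maxKey)
SPLIT_POINTS = [1000, 2000, 3000]


def ranged_shard(x: int) -> int:
    """Index of the ranged chunk (one per shard here) that contains x."""
    return bisect.bisect_right(SPLIT_POINTS, x)


def stand_in_hash(x: int) -> int:
    """Illustrative 64-bit hash; NOT MongoDB's real hashing function."""
    return int.from_bytes(hashlib.md5(str(x).encode("utf-8")).digest()[:8], "big")


def hashed_shard(x: int) -> int:
    """Split the 64-bit hash space into NUM_SHARDS equal ranges."""
    return (stand_in_hash(x) * NUM_SHARDS) >> 64


# New inserts: X keeps increasing past the last split point.
new_values = range(3000, 3000 + NUM_INSERTS)

ranged_counts = Counter(ranged_shard(x) for x in new_values)
hashed_counts = Counter(hashed_shard(x) for x in new_values)

print("ranged:", dict(sorted(ranged_counts.items())))
print("hashed:", dict(sorted(hashed_counts.items())))
```

Under these assumptions every new insert in the ranged case lands on the shard that owns the chunk ending at maxKey, while the hashed case spreads the same inserts roughly evenly over all four shards.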
If you shard an empty collection using a hashed shard key, MongoDB automatically creates and migrates empty chunks so that each shard has two chunks. To control how many chunks MongoDB creates when sharding the collection, use shardCollection with the numInitialChunks parameter.
hashed indexes truncate floating point numbers to 64-bit integers before hashing. For example, a hashed index would store the same value for a field that held 2.9 as for a field that held 2.0, because both truncate to 2. To prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to 64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values larger than 2^53.
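The truncation rule can be checked ahead of time in application code. The following is a minimal sketch under stated assumptions: the helper names `truncate_for_hashing` and `is_safe_for_hashed_index` are hypothetical, and the code only mimics the truncation and round-trip condition described above, not MongoDB's internal implementation.

```python
import math

# Upper bound on supported floating point magnitudes, as stated above.
MAX_SUPPORTED = 2 ** 53


def truncate_for_hashing(value: float) -> int:
    """Drop the fractional part, mimicking the truncation step before hashing."""
    return math.trunc(value)


def is_safe_for_hashed_index(value: float) -> bool:
    """True if value survives float -> integer -> float unchanged and is in range."""
    if not math.isfinite(value) or abs(value) > MAX_SUPPORTED:
        return False
    return float(math.trunc(value)) == value


# 2.9 and 2.0 truncate to the same integer, so they would collide.
print(truncate_for_hashing(2.9) == truncate_for_hashing(2.0))

for candidate in (2.9, 42.0, 2.0 ** 60):
    print(candidate, "safe" if is_safe_for_hashed_index(candidate) else "unsafe")
```

A check like this could run before documents are written, so that values with a fractional part or an out-of-range magnitude are rejected or stored under a different key.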
To learn how to deploy a sharded cluster and implement hashed sharding, see Deploy a Sharded Cluster.