Hashed sharding in MongoDB partitions a collection across shards by the hashed value of a shard key rather than by the key value itself. Because the hash scatters adjacent key values, load is balanced across shards, which avoids hotspots and improves scalability.
Sharding on a single-field hashed index distributes documents across shards using the hashed value of one field, helping balance load and improve scalability, especially for write-heavy workloads.
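To make the effect on write-heavy workloads concrete, here is a minimal Node.js simulation, not MongoDB's actual implementation. It assumes a hypothetical four-shard cluster and uses MD5 modulo the shard count as a stand-in for MongoDB's own hashing and chunk assignment. It then compares that with fixed contiguous ranges for a monotonically increasing key. The helper names `hashedShard`, `rangedShard`, and `distribution` are invented for illustration.

```javascript
// Illustrative simulation only: MongoDB uses its own 64-bit hash and chunk
// ranges; this sketch approximates the idea with MD5 modulo the shard count.
const crypto = require("node:crypto");

const SHARD_COUNT = 4; // hypothetical cluster size
const MAX_KEY = 100000; // hypothetical upper bound of the key space

// Assign a key to a shard by hashing it (hypothetical helper).
function hashedShard(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  // Read the first 4 bytes as an unsigned integer, then reduce modulo shards.
  return digest.readUInt32BE(0) % SHARD_COUNT;
}

// Assign a key to a shard by fixed, contiguous ranges (hypothetical helper).
function rangedShard(key) {
  return Math.min(SHARD_COUNT - 1, Math.floor((key / MAX_KEY) * SHARD_COUNT));
}

// Count how many of the given keys land on each shard.
function distribution(keys, assign) {
  const counts = new Array(SHARD_COUNT).fill(0);
  for (const key of keys) counts[assign(key)] += 1;
  return counts;
}

// The 1,000 most recent values of a monotonically increasing key.
const recentKeys = Array.from({ length: 1000 }, (_, i) => MAX_KEY - 1000 + i);

const hashedCounts = distribution(recentKeys, hashedShard);
const rangedCounts = distribution(recentKeys, rangedShard);

console.log("hashed:", hashedCounts);
console.log("ranged:", rangedCounts);
```

Under these assumptions, the ranged scheme sends all 1,000 recent keys to the last shard, while the hashed scheme spreads them roughly evenly across the four shards.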
Sharding on a compound hashed index (MongoDB 4.4 and later) uses a shard key made of multiple fields, exactly one of which is hashed. The non-hashed fields allow more flexible query targeting, while the hashed field keeps data distribution balanced.
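As a rough sketch, a compound hashed shard key could be declared as shown below. MongoDB 4.4 and later accept a compound shard key that contains a single hashed field alongside non-hashed fields. The `region` field, the `mydatabase.othercollection` namespace, and the wrapper function `shardOnCompoundHashedKey` are hypothetical and exist only for illustration. The `sh` helper is passed in as a parameter so that the function can also be tried with a stub outside mongosh.

```javascript
// Hypothetical helper for mongosh. `sh` is the shell's sharding helper,
// passed in as an argument so the function can be tried with a stub.
function shardOnCompoundHashedKey(sh, namespace) {
  // "region" is a hypothetical non-hashed prefix field; a compound hashed
  // shard key may contain at most one hashed field.
  const key = { region: 1, myShardKeyField: "hashed" };
  return sh.shardCollection(namespace, key);
}

// In mongosh, connected to a mongos router (MongoDB 4.4 or later):
// shardOnCompoundHashedKey(sh, "mydatabase.othercollection")
```

Placing a non-hashed field first lets queries that filter on that field be targeted more narrowly, at the cost of less uniform distribution when that field has few distinct values.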
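The following example walks through implementing hashed sharding in MongoDB.
Before enabling sharding on a collection, ensure that the MongoDB deployment is configured for sharding: it must run as a sharded cluster, and you must be connected through a `mongos` router.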
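One way to check this precondition is to ask the cluster which shards it knows about. The sketch below wraps the `listShards` admin command in a small helper. The function name `listShardIds` is hypothetical, and `db` is passed in as a parameter so the helper can be tried with a stub outside mongosh.

```javascript
// Hypothetical helper for mongosh. `db` is the shell's current database
// object; it is a parameter so the function can be tried with a stub.
function listShardIds(db) {
  // listShards is an admin command served by mongos in a sharded cluster.
  const reply = db.adminCommand({ listShards: 1 });
  return reply.shards.map((shard) => shard._id);
}

// In mongosh, connected to a mongos router:
// listShardIds(db)
```

If the command fails or returns no shards, the deployment is not ready for `sh.shardCollection`. For a fuller summary of the cluster, `sh.status()` prints the shards, databases, and sharded collections.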
// Enable sharding on the database
sh.enableSharding("mydatabase")
// Shard the collection on a hashed shard key
sh.shardCollection("mydatabase.mycollection", { "myShardKeyField": "hashed" })
Insert data into the sharded collection. MongoDB will automatically distribute documents across shards based on the hashed shard key.
db.mycollection.insertOne({
"name": "John Doe",
"age": 30,
"myShardKeyField": "someValue"
})
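To see how documents end up distributed, per-shard counts can be read back from the cluster. The following is a hedged sketch built on the `$collStats` aggregation stage, which emits one document per shard when run against a sharded collection. The helper name `countsPerShard` is hypothetical, and the code assumes mongosh, where `toArray()` returns the results directly.

```javascript
// Hypothetical helper for mongosh. `coll` is a collection object such as
// db.mycollection; it is a parameter so the function can be tried with a stub.
function countsPerShard(coll) {
  // On a sharded cluster, $collStats emits one document per shard.
  const stats = coll.aggregate([{ $collStats: { count: {} } }]).toArray();
  return Object.fromEntries(stats.map((s) => [s.shard, s.count]));
}

// In mongosh, connected to a mongos router:
// countsPerShard(db.mycollection)
```

With only a single document inserted, one shard holds everything, so the spread becomes visible only after many inserts with varied shard key values. The built-in `db.mycollection.getShardDistribution()` prints a similar per-shard breakdown.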
Query data from the sharded collection. For an equality match on the shard key, the query router (`mongos`) hashes the value and sends the query only to the shard that owns it.
db.mycollection.find({ "myShardKeyField": "someValue" })
Example: For a sharded collection named "mycollection" with hashed sharding on the "myShardKeyField" field, the query produces output similar to the following:
{
"_id": ObjectId("60f9d7ac345b7c9df348a86e"),
"name": "John Doe",
"age": 30,
"myShardKeyField": "someValue"
}
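To confirm which shards a query actually touches, its `explain()` output can be inspected. The sketch below assumes the shape that `mongos` typically reports, a `queryPlanner.winningPlan` with a `stage` and a `shards` array. Field names may differ between versions, and the helper name `summarizeRouting` is hypothetical.

```javascript
// Hypothetical helper: reduce a mongos explain() document to the routing
// stage and the names of the shards the query was sent to.
function summarizeRouting(explainOutput) {
  const plan = explainOutput.queryPlanner.winningPlan;
  return {
    stage: plan.stage, // e.g. "SINGLE_SHARD" for a targeted query
    shards: (plan.shards || []).map((s) => s.shardName),
  };
}

// In mongosh, connected to a mongos router:
// summarizeRouting(
//   db.mycollection.find({ "myShardKeyField": "someValue" }).explain()
// )
```

An equality match on the hashed shard key would be expected to report a single shard, whereas a range predicate on the same field would typically be broadcast to every shard.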
Hashed sharding offers several benefits, but it also involves trade-offs. The following table compares it with ranged sharding:
| Hashed Sharding | Ranged Sharding |
|---|---|
| Uses a hash function on the shard key to evenly distribute data across shards. | Divides data into shards based on ranges of the shard key values. |
| Ensures uniform distribution and minimizes hotspots. | Can lead to uneven distribution if ranges are poorly chosen. |
| Efficient for point queries and high-volume inserts. | Efficient for range queries that align with shard key ranges. |
| Not suitable for range queries that span multiple shards (data is non-sequential). | Supports ordered and sequential data access within each shard. |
| Limited flexibility for range-based queries. | More flexible for range-based queries. |
| Simpler to implement and manage shard keys. | More complex to implement and manage shard ranges effectively. |
| Ideal for unpredictable access patterns and write-heavy workloads. | Suitable for applications with frequent range queries or ordered retrieval. |
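The range-query trade-off in the table can be illustrated with a small Node.js simulation. This is a sketch under stated assumptions rather than MongoDB's real routing logic. It assumes a hypothetical four-shard cluster, uses MD5 modulo the shard count in place of MongoDB's hashing and chunk ranges, and uses fixed contiguous ranges for the ranged case. All helper names are invented for illustration.

```javascript
// Illustrative simulation only; not MongoDB's actual routing.
const crypto = require("node:crypto");

const SHARD_COUNT = 4; // hypothetical cluster size
const MAX_KEY = 1000; // hypothetical upper bound of the key space

// Hashed placement: adjacent keys are scattered across shards.
function hashedShard(key) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  return digest.readUInt32BE(0) % SHARD_COUNT;
}

// Ranged placement: adjacent keys stay on the same shard.
function rangedShard(key) {
  return Math.min(SHARD_COUNT - 1, Math.floor((key / MAX_KEY) * SHARD_COUNT));
}

// Shards that a query for keys in [low, high] would need to contact.
function shardsForRange(low, high, assign) {
  const shards = new Set();
  for (let key = low; key <= high; key += 1) shards.add(assign(key));
  return [...shards].sort((a, b) => a - b);
}

const rangedTargets = shardsForRange(100, 149, rangedShard);
const hashedTargets = shardsForRange(100, 149, hashedShard);

console.log("ranged sharding contacts shards:", rangedTargets);
console.log("hashed sharding contacts shards:", hashedTargets);
```

In this simulation the ranged scheme answers the query for keys 100 to 149 from a single shard, while the hashed scheme has to contact several, which is why range queries on a hashed shard key are broadcast.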