Distributed Local Writes for Insert Only Workloads
MongoDB Tag Aware Sharding allows administrators to control data distribution in a sharded cluster by defining ranges of the shard key and tagging them to one or more shards.
This tutorial uses Zones along with a multi-datacenter sharded cluster deployment and application-side logic to support distributed local writes, as well as high write availability in the event of a replica set election or datacenter failure.
Important
The concepts discussed in this tutorial require a specific deployment architecture, as well as application-level logic.
These concepts require familiarity with MongoDB sharded clusters, replica sets, and the general behavior of zones.
This tutorial assumes an insert-only or insert-intensive workload. The concepts and strategies discussed in this tutorial are not well suited for use cases that require fast reads or updates.
Scenario
Consider an insert-intensive application, where reads are infrequent and low priority compared to writes. The application writes documents to a sharded collection, and requires near-constant uptime from the database to support its SLAs or SLOs.
The following represents a partial view of the format of documents the application writes to the database:
{
  "_id" : ObjectId("56f08c447fe58b2e96f595fa"),
  "message_id" : 329620,
  "datacenter" : "alfa",
  "userid" : 123,
  ...
}
{
  "_id" : ObjectId("56f08c447fe58b2e96f595fb"),
  "message_id" : 578494,
  "datacenter" : "bravo",
  "userid" : 456,
  ...
}
{
  "_id" : ObjectId("56f08c447fe58b2e96f595fc"),
  "message_id" : 689979,
  "datacenter" : "bravo",
  "userid" : 789,
  ...
}
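The examples that follow assume an illustrative setup that this section does not spell out: a test.messages namespace and a compound shard key on the datacenter and userid fields, chosen so that ranges on the datacenter prefix can later be tagged to specific shards. A minimal pymongo sketch under those assumptions:

```python
from pymongo import MongoClient

# Connect to a mongos for the sharded cluster; the address is a placeholder.
client = MongoClient("mongodb://mongos.example.net:27017")

# Assumed namespace and shard key: "test.messages" sharded on
# { datacenter: 1, userid: 1 }, so ranges on the datacenter prefix
# can be assigned to individual shards.
client.admin.command("enableSharding", "test")
client.admin.command(
    "shardCollection",
    "test.messages",
    key={"datacenter": 1, "userid": 1},
)
```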
Architecture
The deployment consists of two datacenters, alfa and bravo. There are two shards, shard0000 and shard0001. Each shard is a replica set with three members. shard0000 has two members on alfa and one priority 0 member on bravo. shard0001 has two members on bravo and one priority 0 member on alfa.
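Given this architecture, each shard would be tagged with the name of the datacenter that holds its two data-bearing members, and a range of the shard key would be pinned to each tag. A hedged sketch using the addShardToZone and updateZoneKeyRange admin commands, with the same assumed test.messages namespace and { datacenter: 1, userid: 1 } shard key as above:

```python
from pymongo import MongoClient
from bson.min_key import MinKey
from bson.max_key import MaxKey

client = MongoClient("mongodb://mongos.example.net:27017")

# Tag each shard with the zone named after its primary datacenter.
client.admin.command("addShardToZone", "shard0000", zone="alfa")
client.admin.command("addShardToZone", "shard0001", zone="bravo")

# Pin each datacenter's slice of the shard key space to the matching zone.
# MinKey/MaxKey cover every userid value for a given datacenter value.
client.admin.command(
    "updateZoneKeyRange", "test.messages",
    min={"datacenter": "alfa", "userid": MinKey()},
    max={"datacenter": "alfa", "userid": MaxKey()},
    zone="alfa",
)
client.admin.command(
    "updateZoneKeyRange", "test.messages",
    min={"datacenter": "bravo", "userid": MinKey()},
    max={"datacenter": "bravo", "userid": MaxKey()},
    zone="bravo",
)
```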
Write Operations
If an inserted or updated document matches a configured tag range, it can only be written to a shard with the related tag.
MongoDB can write documents that do not match a configured tag range to any shard in the cluster.
Note
The behavior described above requires the cluster to be in a steady state with no chunks violating a configured tag range. See the following section on the balancer for more information.
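Continuing the assumed setup, the two cases can be illustrated with a pair of inserts:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")
messages = client.test.messages  # assumed namespace from the earlier sketch

# Falls inside the "alfa" tag range, so it can only be written to shard0000.
messages.insert_one({"message_id": 329620, "datacenter": "alfa", "userid": 123})

# "charlie" matches no configured tag range, so MongoDB may write it to any shard.
messages.insert_one({"message_id": 999999, "datacenter": "charlie", "userid": 321})
```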
Balancer
The balancer migrates the tagged chunks to the appropriate shard. Until the migration, shards may contain chunks that violate configured tag ranges and tags. Once balancing completes, shards should only contain chunks whose ranges do not violate their assigned tags and tag ranges.
Adding or removing tags or tag ranges can result in chunk migrations. Depending on the size of your data set and the number of chunks a tag range affects, these migrations may impact cluster performance. Consider running your balancer during specific scheduled windows. See Schedule the Balancing Window for a tutorial on how to set a scheduling window.
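As a hedged example of constraining when these migrations run, the balancing window lives in the settings collection of the config database; the window times below are placeholders:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")

# Limit balancing (and therefore zone-driven chunk migrations) to an
# off-peak window; the start/stop times are placeholder values.
client.config.settings.update_one(
    {"_id": "balancer"},
    {"$set": {"activeWindow": {"start": "23:00", "stop": "06:00"}}},
    upsert=True,
)
```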
Application Behavior
By default, the application writes to the nearest datacenter. If the local datacenter is down, or if writes to that datacenter are not acknowledged within a set time period, the application switches to the other available datacenter by changing the value of the datacenter field before attempting to write the document to the database.
The application supports write timeouts. The application uses Write Concern to set a timeout for each write operation.
If the application encounters a write or timeout error, it modifies the datacenter field in each document and performs the write. This routes the document to the other datacenter. If both datacenters are down, then writes cannot succeed. See Resolve Write Failure.
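A hedged sketch of that write path in pymongo; the majority write concern and 5-second timeout are illustrative values, not requirements from this tutorial:

```python
from pymongo import MongoClient
from pymongo.errors import PyMongoError
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://mongos.example.net:27017")

# Treat any write that is not acknowledged within 5 seconds as a failure.
messages = client.test.get_collection(
    "messages", write_concern=WriteConcern(w="majority", wtimeout=5000)
)

doc = {"message_id": 329620, "datacenter": "alfa", "userid": 123}
try:
    messages.insert_one(doc)
except PyMongoError:
    # Covers write errors and write-concern timeouts: mark "alfa" as down,
    # change the datacenter field, and retry (see Switching Logic below).
    doc["datacenter"] = "bravo"
    messages.insert_one(doc)
```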
The application periodically checks connectivity to any datacenters marked as “down”. If connectivity is restored, the application can continue performing normal write operations.
Given the switching logic, as well as any load balancers or similar mechanisms in place to handle client traffic between datacenters, the application cannot predict which of the two datacenters a given document was written to. To ensure that no documents are missed as a part of read operations, the application must perform broadcast queries by not including the datacenter field as a part of any query.
The application performs reads using a read preference of nearest to reduce latency.
It is possible for a write operation to succeed despite a reported timeout error. The application responds to the error by attempting to re-write the document to the other datacenter, which can result in a document being duplicated across both datacenters. The application resolves duplicates as a part of the read logic.
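A hedged sketch of the read side: the query omits the datacenter field so mongos broadcasts it to both shards, reads use a nearest read preference, and duplicates are collapsed by message_id (treating message_id as the de-duplication key is an assumption based on the document format shown earlier):

```python
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://mongos.example.net:27017")

# Read from the nearest available member to reduce latency.
messages = client.test.get_collection(
    "messages", read_preference=ReadPreference.NEAREST
)

def find_user_messages(userid):
    # No datacenter filter, so the query reaches every shard and finds the
    # document no matter which datacenter it was ultimately written to.
    seen = {}
    for doc in messages.find({"userid": userid}):
        # A retried write may have landed the same logical message in both
        # datacenters; keep a single document per message_id.
        seen.setdefault(doc["message_id"], doc)
    return list(seen.values())
```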
Switching Logic
The application has logic to switch datacenters if one or more writes fail, or if writes are not acknowledged within a set time period. The application modifies the datacenter field based on the target datacenter’s tag to direct the document towards that datacenter.
For example, an application attempting to write to the alfa datacenter might follow this general procedure:
- Attempt to write the document, specifying datacenter : alfa.
- On write timeout or error, log alfa as momentarily down.
- Attempt to write the same document, modifying datacenter : bravo.
- On write timeout or error, log bravo as momentarily down.
- If both alfa and bravo are down, log and report errors.
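A minimal sketch of this switching logic, assuming the same pymongo setup as the earlier examples; the datacenter ordering, timeout, and the way failures are recorded are all illustrative:

```python
import time

from pymongo import MongoClient
from pymongo.errors import PyMongoError
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://mongos.example.net:27017")
messages = client.test.get_collection(
    "messages", write_concern=WriteConcern(w="majority", wtimeout=5000)
)

LOCAL_DC, REMOTE_DC = "alfa", "bravo"  # reversed for an application running in bravo
down_since = {}  # datacenter name -> time it was last marked down

def write_message(doc):
    """Try the local datacenter first, then fail over to the remote one."""
    for dc in (LOCAL_DC, REMOTE_DC):
        doc["datacenter"] = dc  # directs the document toward that datacenter's shard
        try:
            messages.insert_one(doc)
            return dc
        except PyMongoError:
            # Write error or write-concern timeout: log the datacenter as
            # momentarily down and fall through to the next one.
            down_since[dc] = time.time()
    # Both alfa and bravo failed: log and report the error to the caller.
    raise RuntimeError("write failed in both datacenters")
```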