
How to Stop Your MongoDB Cluster from Going Read-Only at the Worst Possible Time

by Puneet Malhotra, March 25th, 2025

Too Long; Didn't Read

Knowledge of MongoDB’s election process and quorum requirements is essential before choosing the best deployment approach.

MongoDB is a document database that offers great flexibility and scalability. It is built on a scale-out architecture, so more nodes can be added to meet increasing demand. That said, deploying it across multiple regions brings a challenge: keeping it available and consistent when a region goes down. Because MongoDB requires a majority of votes to elect a primary, a bad setup can leave the cluster in read-only mode.


When I first worked on a project requiring a MongoDB deployment across multiple active-active regions, I thought it would be straightforward. But when I started going over the available options, I realized that knowledge of MongoDB’s election process and quorum requirements is important before choosing the best approach. In this article, I’m sharing my discoveries and the methods I investigated.

The Quorum Problem in Multi-Region MongoDB Deployments

MongoDB’s replica sets keep things running smoothly by using an election process that requires more than half of the voting members to pick a primary. If a primary cannot collect a majority of votes, it steps down, and no new primary can be elected, leaving the set read-only (all members become secondaries). This mechanism prevents split-brain scenarios with multiple primaries.


For example, in a 4-member set split 2–2 by a network partition, neither side reaches three votes (the majority of 4), resulting in no primary until the partition heals. Even-numbered clusters are not ideal because a perfect split can block write availability. The election protocol will re-run elections if a tie occurs, but repeated ties delay failover. The best practice is to use an odd number of voting nodes to eliminate this risk.
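
To make the majority rule concrete, here is a minimal sketch in plain Python of the floor(n/2) + 1 arithmetic and how many failures each cluster size can tolerate (no MongoDB connection involved):

```python
# Majority needed to elect a primary: floor(n / 2) + 1 voting members.
def majority(voting_members: int) -> int:
    return voting_members // 2 + 1

for n in range(3, 8):
    m = majority(n)
    print(f"{n} voting members -> majority {m}, tolerates {n - m} failure(s)")
```

Running this shows that 4 voting members tolerate no more failures than 3, and 6 no more than 5, which is why an odd number of voting members is the standard recommendation.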

Consider a deployment with:

  • 3 nodes in East (voting members)

  • 3 nodes in West (voting members)

  • 1 additional node in North (tie-breaker node)

[Figure: multi-region cluster across East, West, and North]

If an entire region goes down, the surviving region’s 3 nodes alone do not form a majority (4 of the 7 votes are needed). The 7th node in a neutral region (North) acts as a tie-breaker: together with the surviving region it reaches the 4 votes required to maintain quorum. This node can be either an arbiter (vote-only) or a data-bearing node. Let’s examine the trade-offs.
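
As a quick operational check, the sketch below (assuming pymongo and placeholder hostnames) joins the replSetGetConfig and replSetGetStatus admin commands to count how many votes are currently reachable and whether the set can still form a majority:

```python
from pymongo import MongoClient

# Connect directly to one member we can reach (hostname is a placeholder).
client = MongoClient("mongodb://east-1.example.net:27017/?directConnection=true")

config = client.admin.command("replSetGetConfig")["config"]   # who votes
status = client.admin.command("replSetGetStatus")             # who is healthy

votes_by_host = {m["host"]: m.get("votes", 1) for m in config["members"]}
healthy_votes = sum(
    votes_by_host.get(m["name"], 0)
    for m in status["members"]
    if m.get("health") == 1
)
total_votes = sum(votes_by_host.values())
needed = total_votes // 2 + 1

print(f"healthy votes: {healthy_votes}/{total_votes}, majority needed: {needed}")
if healthy_votes < needed:
    print("WARNING: no primary can be elected; the set is effectively read-only")
```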

Approach 1: 3-3-1 Setup with an Arbiter

When setting up a MongoDB replica set, it is important to understand how write concerns interact with arbiters. In a three-node primary-secondary-arbiter (PSA) configuration, the majority is 2 of 3 votes, but only the two data-bearing members can acknowledge writes. With the write concern set to majority, every write must be acknowledged by both data-bearing members, so if either one goes down, majority writes time out or never get acknowledged. To avoid this, it is generally better to use another data-bearing voting member instead of an arbiter.
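
The sketch below shows what that failure mode looks like from the driver side. It is a hedged example using pymongo with placeholder hostnames and collection names: adding a wtimeout to the majority write concern turns an indefinite wait into a visible error when a data-bearing member is unavailable in a PSA set.

```python
from pymongo import MongoClient, WriteConcern
from pymongo.errors import WTimeoutError

client = MongoClient("mongodb://east-1.example.net:27017/?replicaSet=rs0")
orders = client.get_database("shop").get_collection(
    "orders",
    # Majority ack requires both data-bearing members in a PSA set;
    # wtimeout caps how long we wait for that acknowledgement.
    write_concern=WriteConcern(w="majority", wtimeout=5000),
)

try:
    orders.insert_one({"sku": "abc-123", "qty": 1})
except WTimeoutError:
    # The write may still be applied on the primary; only the
    # majority acknowledgement timed out.
    print("majority ack not reached -- is a data-bearing member down?")
```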


Arbiters don’t store any data, so they don’t have an oplog. This means they can’t track the latest operations and compare how current a candidate’s data is the way other nodes can. During elections, this can sometimes lead the arbiter to vote for a secondary with a lower priority or less recent data, which might not be ideal.


Replication

In larger setups, such as three nodes in one region, three in another, and an arbiter in a third location, replication traffic spans regions. This kind of setup helps with redundancy and failover but also introduces considerations like cross-region latency and data consistency.


  • 3 nodes in West (voting)
  • 3 nodes in East (voting)
  • 1 arbiter in North (voting, no data)
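
A rough sketch of initiating this layout via the replSetInitiate admin command (hostnames and ports are placeholders; in mongosh the same document would be passed to rs.initiate()):

```python
from pymongo import MongoClient

# Talk directly to a single seed member before the replica set exists.
seed = MongoClient("mongodb://west-1.example.net:27017/?directConnection=true")

rs_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "west-1.example.net:27017"},
        {"_id": 1, "host": "west-2.example.net:27017"},
        {"_id": 2, "host": "west-3.example.net:27017"},
        {"_id": 3, "host": "east-1.example.net:27017"},
        {"_id": 4, "host": "east-2.example.net:27017"},
        {"_id": 5, "host": "east-3.example.net:27017"},
        # Vote-only tie-breaker: no data, no oplog.
        {"_id": 6, "host": "north-1.example.net:27017", "arbiterOnly": True},
    ],
}
seed.admin.command("replSetInitiate", rs_config)
```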

Why It Works

  • If West fails, the 3 East nodes and arbiter can elect a new primary.
  • If East fails, the 3 West nodes and arbiter can elect a new primary.
  • Requires no extra storage since the arbiter doesn’t hold data.

Downsides

  • Arbiters don’t acknowledge writes, meaning that if a data-bearing secondary is also down, writes requiring majority can fail.

  • MongoDB discourages arbiters in production, as they add risk without redundancy.


Best for: When cost constraints prevent adding a full data-bearing node, but a third vote is needed for quorum.

Approach 2: 3-3-1 Setup with a Data-Bearing Tie-Breaker Node

  • 3 nodes in West (voting)
  • 3 nodes in East (voting)
  • 1 data-bearing secondary in North
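
A sketch of the same layout with a data-bearing North member instead of an arbiter (hostnames, the reduced priority, and the region tags are assumptions, not requirements): the lower priority keeps North from winning elections under normal conditions while it still votes, replicates, and can serve tagged reads.

```python
from pymongo import MongoClient

seed = MongoClient("mongodb://west-1.example.net:27017/?directConnection=true")

rs_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "west-1.example.net:27017", "tags": {"region": "west"}},
        {"_id": 1, "host": "west-2.example.net:27017", "tags": {"region": "west"}},
        {"_id": 2, "host": "west-3.example.net:27017", "tags": {"region": "west"}},
        {"_id": 3, "host": "east-1.example.net:27017", "tags": {"region": "east"}},
        {"_id": 4, "host": "east-2.example.net:27017", "tags": {"region": "east"}},
        {"_id": 5, "host": "east-3.example.net:27017", "tags": {"region": "east"}},
        # Data-bearing tie-breaker: votes and holds a full copy of the data,
        # but the low priority makes it an unlikely primary.
        {"_id": 6, "host": "north-1.example.net:27017",
         "tags": {"region": "north"}, "priority": 0.5},
    ],
}
seed.admin.command("replSetInitiate", rs_config)
```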

Why It Works

  • If an entire region fails, the tie-breaker in the third region supplies the deciding vote, so the surviving region can elect a new primary.
  • The extra node stores data, reducing the risk of data loss.
  • Unlike an arbiter, this node can be queried (see the read-preference sketch after this list).
  • MongoDB recommends this setup over using an arbiter.
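
For example, reads can be routed to the North member using the region tag from the config sketch above (the tag name and value are assumptions):

```python
from pymongo import MongoClient
from pymongo.read_preferences import Secondary

client = MongoClient("mongodb://west-1.example.net:27017/?replicaSet=rs0")
orders = client.get_database("shop").get_collection(
    "orders",
    # Prefer the member tagged region=north; the empty {} falls back
    # to any secondary if North is unavailable.
    read_preference=Secondary(tag_sets=[{"region": "north"}, {}]),
)
print(orders.count_documents({}))
```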

Downsides

  • Higher storage and replication cost than an arbiter.


Best for: High availability where an extra backup and read scaling are required.


While arbiters can help with elections, they don’t contribute to data durability. Having an odd number of data-bearing voting members is safer for maintaining a healthy and reliable replica set.

Alternative Approaches

Hidden Node Instead of a Tie-Breaker

A hidden node is a data-bearing secondary that can never become primary (it must have priority 0) and is invisible to client reads, but by default it still votes in elections; set votes: 0 if you want it purely as a backup. It provides a full copy of the data, acts as a recovery option, and can be unhidden through a manual reconfiguration if needed, as sketched below.
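
A sketch of adding such a member through replSetReconfig (the hostname is a placeholder); hidden requires priority 0, and adding "votes": 0 would keep it out of elections entirely:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://west-1.example.net:27017/?replicaSet=rs0")

config = client.admin.command("replSetGetConfig")["config"]
config["version"] += 1          # every reconfig must bump the version
config["members"].append({
    "_id": 7,
    "host": "backup-1.example.net:27017",
    "priority": 0,              # can never become primary
    "hidden": True,             # invisible to client reads
})
client.admin.command("replSetReconfig", config)
```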

Sharded Cluster with Region-Aware Primaries

For true active-active writes, a sharded cluster with zone sharding lets each region host the primary for its own subset of data, so writes for local data never have to cross regions. The trade-off is application-level changes (a region-aware shard key) and added operational complexity, as sketched below.
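
A rough sketch of zone sharding with pymongo (the database name, shard names, zones, and the region-based shard key are all assumptions): each region’s key range is pinned to a shard located in that region, so that region’s writes stay local.

```python
from pymongo import MongoClient
from bson import MinKey, MaxKey

mongos = MongoClient("mongodb://mongos.example.net:27017")
admin = mongos.admin

admin.command("enableSharding", "shop")
admin.command("shardCollection", "shop.orders", key={"region": 1, "_id": 1})

# Map each zone to the shard physically located in that region.
admin.command("addShardToZone", "shard-east-1", zone="EAST")
admin.command("addShardToZone", "shard-west-1", zone="WEST")

# Pin each region's portion of the key space to its zone.
admin.command("updateZoneKeyRange", "shop.orders",
              min={"region": "east", "_id": MinKey()},
              max={"region": "east", "_id": MaxKey()}, zone="EAST")
admin.command("updateZoneKeyRange", "shop.orders",
              min={"region": "west", "_id": MinKey()},
              max={"region": "west", "_id": MaxKey()}, zone="WEST")
```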

Clarifying Active-Active in MongoDB

MongoDB does not support multi-primary replication in a replica set. Only one primary exists at a time, and secondaries replicate from it. If each region needs its own primary, that has to be achieved with a sharded cluster, where every shard is an independent replica set with its own primary.

What to Remember:

  • MongoDB replica sets require a majority for elections.
  • Only one primary exists at a time.
  • Arbiters should only be used as a last resort.
  • A two-region deployment must include a third vote in a separate location.
  • For true active-active, use sharding.

Final Thoughts

If you can afford it, use a data-bearing node instead of an arbiter to maintain high availability and avoid problems with majority writes. Understanding the options above helps in designing a resilient multi-region MongoDB deployment.


References:

https://www.mongodb.com/docs/manual/replication/

https://www.mongodb.com/docs/manual/sharding/

https://www.mongodb.com/docs/manual/core/replica-set-elections/

https://www.mongodb.com/docs/manual/core/replica-set-secondary/

https://www.mongodb.com/docs/manual/reference/write-concern/#calculating-majority-for-write-concern