How to Run MongoDB on EC2 Spot Instances - Spot.io

MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling. Because documents map naturally to objects in application code, MongoDB removes the need for an Object-Relational Mapping (ORM) layer, which simplifies development.

MongoDB is currently listed as the fifth most popular DBMS in the world, just behind PostgreSQL, MSSQL, MySQL, and Oracle.

In this article, we will discuss how we can leverage Spotinst’s Elastigroup to reduce MongoDB compute costs by 80% without compromising availability, performance, or user experience.

How MongoDB replica set fault tolerance works

To provide fault tolerance, MongoDB allows creating replica sets, which are essentially clusters of MongoDB servers that implement master-slave replication and automated failover.

In a replica set, all write operations go to the set’s primary, which records them in its operation log (oplog). Secondary members continuously replicate the oplog and apply the operations to their own data sets asynchronously.
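The replication flow described above can be illustrated with a toy model. This is only a sketch of the idea (a primary appending to a log and a secondary replaying it later); the `Primary` and `Secondary` classes are hypothetical and do not reflect how MongoDB implements the oplog internally.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy primary: applies each write locally and records it in its oplog."""
    data: dict = field(default_factory=dict)
    oplog: list = field(default_factory=list)

    def write(self, key, value):
        self.data[key] = value
        self.oplog.append((key, value))


@dataclass
class Secondary:
    """Toy secondary: replays the oplog entries it has not applied yet."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of oplog entries already applied

    def sync(self, primary):
        for key, value in primary.oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.oplog)


primary = Primary()
secondary = Secondary()

primary.write("user:1", "alice")
primary.write("user:2", "bob")

# Replication is asynchronous, so the secondary lags until it syncs.
lagging = dict(secondary.data)

secondary.sync(primary)
caught_up = secondary.data == primary.data
```

Until `sync` runs, the secondary holds none of the new writes, which is why reads from secondaries can return slightly stale data.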

If the primary instance fails, the remaining replica set members try to form a majority and then elect a new primary for the set. A majority is mandatory for electing a new primary in order to preserve the integrity of the replica set as a whole.

When quorum is not satisfied (no majority), the data is still readable from the secondary instances, but write operations cannot be performed.

Fault Tolerance table

Members    Required for majority    Fault tolerance
3          2                        1
5          3                        2
6          4                        2
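The rows above follow from simple arithmetic: a majority is more than half of the members, and the fault tolerance is whatever remains. A minimal sketch, assuming every member is a voting member (the function names are illustrative):

```python
def majority(members: int) -> int:
    """Smallest number of members that is more than half of the set."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority can still be formed."""
    return members - majority(members)


rows = [(n, majority(n), fault_tolerance(n)) for n in (3, 5, 6)]

for n, needed, tolerated in rows:
    print(f"{n} members: majority={needed}, fault tolerance={tolerated}")
```

Note that going from 5 to 6 members raises the required majority without improving fault tolerance, which is one reason odd-sized replica sets are common.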

Please refer to the MongoDB documentation to learn more about replica sets and how to configure them:

https://docs.mongodb.com/manual/replication/

https://docs.mongodb.com/manual/tutorial/deploy-replica-set/

The importance of instance distribution across availability zones (data centers)

To protect data if an availability zone fails, instances should be spread across multiple availability zones. If possible, use an odd number of availability zones, and choose a distribution of members that maximizes the likelihood that, even with the loss of an availability zone, the remaining replica set members can form a majority.
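One way to sanity-check a planned distribution is to test whether the replica set keeps a majority after losing any single zone. The sketch below assumes all members are voting members; the zone names are only examples.

```python
from typing import Dict


def survives_any_zone_loss(members_per_zone: Dict[str, int]) -> bool:
    """Return True if a majority remains after losing any one zone."""
    total = sum(members_per_zone.values())
    needed = total // 2 + 1
    return all(total - count >= needed for count in members_per_zone.values())


# Three members in two zones: losing the zone with two members leaves
# only one of three, which is not a majority.
two_zones = survives_any_zone_loss({"us-east-1a": 2, "us-east-1b": 1})

# Three members in three zones: any single zone loss leaves two of three.
three_zones = survives_any_zone_loss(
    {"us-east-1a": 1, "us-east-1b": 1, "us-east-1c": 1}
)

print(two_zones, three_zones)
```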

MongoDB and Spot Instances

Spot instances are not the first thing that comes to mind when designing a MongoDB cluster.
Their dynamic nature, and the fact that they can be terminated with only two minutes' notice, make it hard to maintain a stable environment. This is where Elastigroup comes into play.
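For context, EC2 exposes the interruption notice to the instance through the instance metadata service (the `spot/instance-action` item) as a small JSON document. The sketch below only parses such a document and computes the time left; the sample payload and timestamps are illustrative, and this is not a description of how Elastigroup itself works.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(payload: str, now: datetime) -> float:
    """Parse an instance-action notice and return the seconds remaining."""
    notice = json.loads(payload)
    deadline = datetime.strptime(
        notice["time"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


# Illustrative payload in the shape of an instance-action notice.
sample = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
now = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)

remaining = seconds_until_interruption(sample, now)
print(f"{remaining:.0f} seconds left to drain this member")
```

Two minutes is rarely enough to move a large data set, which is why the replacement and volume-retention mechanisms discussed in this article matter.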

Elastigroup allows 100% availability of service on top of the EC2 Spot market by choosing the right bid for the right Spot Instance. Historical and real-time data is analyzed to choose the Spot Instances that offer the best combination of low price and longevity. Using a predictive algorithm, changes in the Spot market are identified 15 minutes in advance and a Spot replacement is triggered seamlessly, without service interruption.
Elastigroup can also distribute instances across availability zones and instance types to mitigate issues related to network partitioning.

NOTE: From the moment an election is needed until a new primary is elected, the replica set stops accepting write requests. To minimize primary replacements, the best practice is to run at least one on-demand instance and set its priority for the primary election to 1000.

Learn more about member priorities:

https://docs.mongodb.com/v3.0/tutorial/adjust-replica-set-member-priority/
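As a rough illustration of the priority change, the sketch below edits a replica set configuration document in plain Python. The set name, host names, and the `with_priority` helper are hypothetical; only the `members`, `host`, `priority`, and `version` fields follow the replica set configuration format.

```python
import copy


def with_priority(config: dict, host: str, priority: float) -> dict:
    """Return a copy of the config with one member's priority changed."""
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["host"] == host:
            member["priority"] = priority
            break
    else:
        raise ValueError(f"no member with host {host!r}")
    # A reconfiguration must carry a higher version than the current one.
    new_config["version"] += 1
    return new_config


# Hypothetical three-member set: one on-demand instance, two Spot instances.
current = {
    "_id": "rs0",
    "version": 1,
    "members": [
        {"_id": 0, "host": "mongo-od.example.internal:27017", "priority": 1},
        {"_id": 1, "host": "mongo-spot-1.example.internal:27017", "priority": 1},
        {"_id": 2, "host": "mongo-spot-2.example.internal:27017", "priority": 1},
    ],
}

updated = with_priority(current, "mongo-od.example.internal:27017", 1000)
print(updated["members"][0])
```

On a live cluster, the current document would typically come from the `replSetGetConfig` command and the edited one would be applied with `replSetReconfig` (or `rs.reconfig()` in the mongo shell); check the documentation for your MongoDB version before applying it.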

Retaining the data volumes

When running a MongoDB cluster, it is important to preserve the data. When a member's data is missing, a “rebuild” process is initiated and the data is resynced in full. This process can take some time (depending on the volume of data) and can also be quite expensive, for example if the data is replicated across availability zones. During the rebuild, users might experience poor performance because at least one of the read replicas is not queryable.

As part of its stateful feature, Elastigroup allows retaining the data volumes of the machine. Any EBS volume attached to the instance is continuously snapshotted while the machine is running, and the snapshots are used as the block device mapping configuration upon replacement.

Configuration

Assuming you have pre-configured your replica set, use the following steps to run your MongoDB cluster on Spotinst Elastigroup:

  1. Log in to Spotinst console
  2. Click Elastigroups
  3. Create / Edit the relevant Elastigroup
  4. Under ‘Compute’ find and expand the ‘Stateful’ section
  5. Check the ‘Persist Root Volume’ option
    • Choose ‘Maintain Root Volume’
  6. Check the ‘Persist Data Volumes’ option
    • Choose ‘Hot EBS Migration’
    • Add the volumes that you would like to be retained upon instance replacement
  7. Check the ‘Persist Private IP’ option

I hope you found this article useful.
If you have any questions or need guidance, please contact our customer success team at CS@spotinst.com.