FAQ: Backup and Restore

This FAQ addresses common questions about how Cloud Manager backs up and restores databases and collections.

Cloud Manager creates backups of MongoDB replica sets and sharded clusters. After an initial sync, Cloud Manager tails the operation log (oplog) to provide a continuous backup with point-in-time recovery of replica sets and consistent snapshots of sharded clusters. For more information, please review these frequently asked questions.

Requirements

What version of MongoDB does the Backup feature require?

For information on compatibility, see MongoDB Compatibility Matrix.

What MongoDB permissions does the Backup Agent require?

If you are backing up a MongoDB instance that has authentication enabled, the Backup Agent requires elevated privileges, as described in Required Access for Backup Agent.

Does Backup work with all types of deployments?

No. Backup does not currently support standalone deployments. Backup has full support for replica sets and sharded clusters.

Why does the Backup feature not support standalone deployments?

After an initial sync of your data, Cloud Manager copies data from the oplog to provide a continuous backup with point-in-time recovery. Cloud Manager does not support backup of standalone hosts because they do not have an oplog. To support backup with a single mongod instance, you can run a one-member replica set.

How does Cloud Manager measure data size?

Cloud Manager uses the following conversions to measure snapshot size and to measure how much oplog data has been processed:

  • 1 MB = 1024² bytes
  • 1 GB = 1024³ bytes
  • 1 TB = 1024⁴ bytes
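
As a minimal sketch of these conversions, the following Python snippet turns a raw byte count into the 1024-based units listed above. The function name bytes_to_unit and the sample snapshot size are hypothetical and are not part of Cloud Manager.

```python
# Binary-unit conversions as listed above: 1 MB = 1024^2 bytes, and so on.
BYTES_PER_UNIT = {
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def bytes_to_unit(num_bytes: int, unit: str) -> float:
    """Convert a raw byte count to MB, GB, or TB using 1024-based units."""
    return num_bytes / BYTES_PER_UNIT[unit]


if __name__ == "__main__":
    snapshot_bytes = 5 * 1024 ** 3  # hypothetical snapshot of exactly 5 GB
    print(bytes_to_unit(snapshot_bytes, "GB"))  # 5.0
    print(bytes_to_unit(snapshot_bytes, "MB"))  # 5120.0
```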

Operations

How does the Backup feature work?

You install the Backup Agent on a server in the same deployment with your MongoDB infrastructure. The agent conducts an initial sync of your data to Cloud Manager. After the initial sync, the agent tails the oplog to provide a continuous backup of your deployment.

Where should I run the Backup Agent?

Run the Backup Agent on a host that:

  • Is separate from your MongoDB instances. This avoids system resource contention.
  • Can connect to your MongoDB instances. Check network settings for connections between the agent and MongoDB hosts. For a list of needed ports, see open ports for agents.
  • Has at least 2 CPU cores and 3 GB of RAM above platform requirements. With each backup job it runs, the Backup Agent further impacts host performance.

Can I run the Backup and Monitoring Agents on a single system?

There is no technical restriction that prevents the Backup Agent and the Monitoring Agent from running on a single system or host. However, both agents have resource requirements, and running both on a single system can affect the ability of these agents to support your deployment in Cloud Manager.

The resources required by the Backup Agent depend on the rate and size of new oplog entries (i.e., the total oplog gigabytes per hour produced). The resources that the Monitoring Agent requires depend on the number of monitored mongod instances and the total number of databases those instances provide.

Can I run multiple Backup Agents to achieve high availability?

You can run multiple Backup Agents for high availability. If you do, the Backup Agents must run on different hosts.

When you run multiple Backup Agents, only one agent per project is the primary agent. The primary agent performs the backups. The remaining agents are completely idle, except to log their status as standbys and to periodically ask Cloud Manager whether they should become the primary.

Does the Backup Agent modify my database?

The Backup Agent writes a small token called a “checkpoint” into the oplog of the source database at a regular interval. These tokens provide a heartbeat for backups and have no effect on the source deployment. Each token is less than 100 bytes. For more information, see Checkpoints.

Will Backup impact my production databases?

The Backup feature will typically have minimal impact on production MongoDB deployments. This impact will be similar to that of adding a new secondary to a replica set.

By default, the Backup Agent will perform its initial sync, the most resource intensive operation for backups, against a secondary member of the replica set to limit its impact. You may optionally configure the Backup Agent to perform the initial sync against the replica set’s primary, although this will increase the impact of the initial sync operation.

Is my data safe?

Yes, Cloud Manager uses enterprise-grade hardware co-located in secure data centers to store all user data. The Backup Agent transmits all data using SSL. The data is not encrypted at rest. Cloud Manager requires two-factor authentication to provide any data for restores.

Is there a limit to Backup size?

There is currently no limit on the total size of snapshot storage. Backup works best for deployments whose total size is less than 2 TB.

If you wish to use the Backup feature for a larger deployment, please contact us for more information.

What is the load on the database during the initial Backup sync?

The impact of the initial backup synchronization should be similar to syncing a new secondary replica set member. The Backup Agent does not throttle its activity, and attempts to perform the sync as quickly as possible.

How do I perform maintenance on a replica set with Backup enabled?

Most operations in a replica set are replicated via the oplog and are thus captured by the backup process. Some operations, however, make changes that are not replicated: for these operations you must have Cloud Manager resync from your current replica set to include the changes.

The following operations are not replicated and therefore require resync:

  • Renaming or deleting a database by deleting the data files in the data directory. As an alternative, remove databases using an operation that MongoDB will replicate, such as db.dropDatabase() from the mongo shell.

  • Changing any data while the instance is running as a standalone.

  • Rolling index builds.

  • Using compact or repairDatabase to reclaim a significant amount of space.

    Resync is not strictly necessary after compact or repairDatabase operations but will ensure that the Cloud Manager copy of the data is resized, which means quicker restores and lower cost.

Does the Backup Agent support SSL?

The Backup Agent always connects to the Cloud Manager servers using an SSL (HTTPS) connection.

The Backup Agent can connect to replica sets and sharded clusters configured with SSL. See the Configure Backup Agent for SSL documentation for more information.

Configuration

How can I prevent Cloud Manager from backing up a collection?

Cloud Manager provides a namespaces filter that allows you to specify which collections or databases to back up.
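
To illustrate the idea of a namespaces filter, here is a small Python sketch that decides whether a namespace of the form "database.collection" is excluded from backup. The matching rule (an entry is either a whole database name or a full namespace) and the function name is_excluded are assumptions made for illustration; this is not Cloud Manager's implementation.

```python
def is_excluded(namespace: str, excluded: set) -> bool:
    """Return True if `namespace` ("db.collection") matches an exclusion entry.

    Assumption for this sketch: each entry in `excluded` is either a database
    name ("logs") or a full namespace ("app.sessions").
    """
    database = namespace.split(".", 1)[0]
    return namespace in excluded or database in excluded


if __name__ == "__main__":
    excluded = {"logs", "app.sessions"}  # hypothetical filter entries
    for ns in ("logs.events", "app.sessions", "app.users"):
        status = "skipped" if is_excluded(ns, excluded) else "backed up"
        print(ns, "->", status)
```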

How can I change which namespaces are backed up?

To edit the filter, see Edit a Backup’s Settings. Changing the namespaces filter might necessitate a resync. If so, Cloud Manager handles the resync.

Restoration

Cloud Manager produces a copy of your data files that you can use to seed a new deployment.

How does Cloud Manager provide point-in-time restores?

Cloud Manager first creates a local restore of a snapshot preceding the point in time and downloads it to your target host. The MongoDB Backup Restore Utility running on that host then downloads and applies oplog entries to reach the specified point in time.

Cloud Manager can build a restore to any point in time within a 24-hour period by replaying the oplog to the desired time.
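
The selection logic described above can be sketched in Python as follows: choose the most recent snapshot that precedes the requested time, then replay only the oplog entries between that snapshot and the target. The function plan_point_in_time_restore and its in-memory data structures are hypothetical; the actual restore is performed by Cloud Manager and the MongoDB Backup Restore Utility.

```python
from datetime import datetime, timedelta


def plan_point_in_time_restore(snapshot_times, oplog_entries, target):
    """Pick the base snapshot and the oplog entries to replay.

    snapshot_times: datetimes at which snapshots were taken.
    oplog_entries:  (timestamp, operation) tuples in oplog order.
    target:         the requested point in time.
    """
    earlier = [t for t in snapshot_times if t <= target]
    if not earlier:
        raise ValueError("no snapshot precedes the requested point in time")
    base = max(earlier)
    # Replay only operations newer than the snapshot, up to and including the target.
    to_replay = [op for ts, op in oplog_entries if base < ts <= target]
    return base, to_replay


if __name__ == "__main__":
    day = datetime(2024, 1, 1)
    snapshots = [day + timedelta(hours=h) for h in (0, 6, 12)]
    oplog = [(day + timedelta(hours=h), f"op-{h}") for h in range(1, 13)]
    base, ops = plan_point_in_time_restore(
        snapshots, oplog, day + timedelta(hours=8, minutes=30)
    )
    print(base, ops)
```

In this example the 06:00 snapshot is the base, and only the operations at 07:00 and 08:00 are replayed.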

To learn how to restore replica sets and sharded clusters, see Restore MongoDB Deployments.

Can I take snapshots more frequently than every 6 hours?

No. Cloud Manager does not support a snapshot schedule more frequent than every 6 hours. For more information, see Snapshot Frequency and Retention Policy.

Can I set my own snapshot retention policy?

Yes. You can change the schedule through the Edit Snapshot Schedule menu option for a backed-up deployment. Administrators can change the snapshot frequency and retention policy through the snapshotSchedule resource in the API.

Customizing snapshot frequency and retention policies gives you greater control over your backup costs.
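
As a rough illustration of how frequency and retention interact, the Python sketch below counts how many snapshots are retained at once for a given interval and retention period. The function name and the simple counting model are assumptions for illustration only; they do not reproduce Cloud Manager's pricing or the fields of the snapshotSchedule resource.

```python
MIN_INTERVAL_HOURS = 6  # this FAQ states snapshots cannot be more frequent than every 6 hours


def retained_snapshot_count(interval_hours: int, retention_days: int) -> int:
    """Approximate number of snapshots kept at once under a simple schedule."""
    if interval_hours < MIN_INTERVAL_HOURS:
        raise ValueError("snapshot interval must be at least 6 hours")
    return (retention_days * 24) // interval_hours


if __name__ == "__main__":
    print(retained_snapshot_count(6, 2))   # 8 snapshots: every 6 hours, kept 2 days
    print(retained_snapshot_count(24, 2))  # 2 snapshots: daily, kept 2 days
```

Under this model, a longer interval or a shorter retention period means fewer stored snapshots.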

How many copies of my data does Cloud Manager store?

Although we charge you for only one copy of the data, Cloud Manager stores at least 3 copies of your data in at least 2 geographic locations to ensure redundancy.

How long does it take to create a restore?

Cloud Manager transmits all backups in a compressed form from the Cloud Manager server to your infrastructure.

Within the US, Cloud Manager sends snapshots at 50-100 Mbps. Assuming a compression factor of 4x and a transmission speed of 50 Mbps, a 250 GB snapshot compresses to about 62.5 GB and takes a little under 3 hours (about 2.8 hours) to transfer.

In addition, the time required for a point-in-time restore depends on the number of oplog entries that your host must apply to the received snapshot to roll forward to the requested point in time.
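
The transfer-time arithmetic can be checked with a short Python sketch. It assumes decimal units for the link (1 GB = 8,000 megabits) and ignores protocol overhead and oplog replay time; the function name estimate_transfer_hours is hypothetical. Under those assumptions a 250 GB snapshot at 4x compression and 50 Mbps works out to about 2.8 hours. Actual throughput varies, so treat any such figure as a rough estimate.

```python
def estimate_transfer_hours(snapshot_gb: float, compression_factor: float, mbps: float) -> float:
    """Estimate the hours needed to transfer a compressed snapshot.

    Assumes decimal units (1 GB = 8000 megabits) and a constant link speed.
    """
    compressed_gb = snapshot_gb / compression_factor
    megabits = compressed_gb * 8 * 1000
    seconds = megabits / mbps
    return seconds / 3600


if __name__ == "__main__":
    print(round(estimate_transfer_hours(250, 4, 50), 2))   # 2.78
    print(round(estimate_transfer_hours(250, 4, 100), 2))  # 1.39
```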

Does the Backup feature perform any data validation?

Backup conducts basic corruption checks and provides an alert if any component (e.g. the agent) is down or broken, but does not perform explicit data validation. When it detects corruption, Cloud Manager errs on the side of caution and invalidates the current backup and sends an alert.

How do I restore a snapshot?

You can request a restore via Cloud Manager, where you can then choose which snapshot to restore and how you want Cloud Manager to deliver the restore. All restores require two-factor authentication. If you have SMS set up, Cloud Manager will send an authorization code via SMS. You must enter the authorization code into the backup interface to begin the restore process.

Note

From India, use Google Authenticator for two-factor authentication. Google Authenticator is more reliable than authentication with SMS text messages to Indian mobile phone numbers (i.e. country code 91).

What is delivered when I restore a snapshot?

Cloud Manager delivers restores as tar.gz archives of MongoDB data files.

For more information, see Restore MongoDB Deployments.

How does Cloud Manager handle a rollback of backed-up data?

If your MongoDB deployment experiences a rollback, then Cloud Manager also rolls back the backed-up data.

Cloud Manager detects the rollback when a tailing cursor finds a mismatch in timestamps or hashes of write operations. Cloud Manager enters a rollback state and tests three points in the oplog of your replica set’s primary to locate a common point in history. Cloud Manager rollback differs from MongoDB secondary rollback in that the common point does not necessarily have to be the most recent common point.

When Cloud Manager finds a common point, the service invalidates oplog entries and snapshots beyond that point and rolls back to the most recent snapshot before the common point. Cloud Manager then resumes normal backup operations.

If Cloud Manager cannot find a common point, a resync is required.
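
The invalidation step described above can be sketched in Python. Given a common point, the sketch drops snapshots and oplog entries beyond it and returns the most recent remaining snapshot as the state to roll back to. Timestamps are plain integers, the function name apply_rollback is hypothetical, and raising an error when no snapshot precedes the common point is an assumption of this sketch rather than documented Cloud Manager behavior.

```python
def apply_rollback(snapshot_times, oplog_times, common_point):
    """Invalidate backup state beyond the common point.

    Returns (kept_snapshots, kept_oplog, rollback_snapshot).
    """
    kept_snapshots = [t for t in snapshot_times if t <= common_point]
    kept_oplog = [t for t in oplog_times if t <= common_point]
    if not kept_snapshots:
        # Assumption: with nothing to roll back to, a resync would be needed.
        raise LookupError("no snapshot before the common point")
    return kept_snapshots, kept_oplog, max(kept_snapshots)


if __name__ == "__main__":
    snapshots = [10, 20, 30]
    oplog = list(range(1, 36))
    kept_snaps, kept_ops, base = apply_rollback(snapshots, oplog, common_point=24)
    print(kept_snaps, kept_ops[-1], base)  # [10, 20] 24 20
```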

What conditions will require a resync?

If the Backup Agent’s tailing cursor cannot keep up with your deployment’s oplog, then you must resync your backups.

This scenario might occur if:

  • Your application periodically generates a lot of data, shrinking the primary’s oplog window to the point that data is written to the oplog faster than Cloud Manager can consume it.
  • The Backup Agent is running on an under-provisioned or over-used machine and cannot keep up with the oplog activity.
  • The Backup Agent is down for a period of time longer than the oplog size allows. If you bring down your agents, such as for maintenance, restart them in a timely manner. For more information on oplog size, see Replica Set Oplog in the MongoDB manual.
  • You delete all replica set data and deploy a new replica set with the same name, as might happen in a test environment where deployments are regularly torn down and rebuilt.
  • There is a rollback, and Cloud Manager cannot find a common point in the oplog.
  • An oplog event tries to update a document that does not exist in the backup of the replica set, as might happen if syncing from a secondary that has inconsistent data with respect to the primary.
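
To make the agent downtime case concrete, the following Python sketch estimates the oplog window from the oplog size and the write rate, then checks whether a planned agent outage would exceed it. The function names and the constant-write-rate model are assumptions for illustration; real oplog windows fluctuate with workload.

```python
def oplog_window_hours(oplog_size_gb: float, write_rate_gb_per_hour: float) -> float:
    """Hours of history the oplog holds at a constant write rate."""
    return oplog_size_gb / write_rate_gb_per_hour


def downtime_requires_resync(
    downtime_hours: float, oplog_size_gb: float, write_rate_gb_per_hour: float
) -> bool:
    """True if the agent would be down longer than the oplog window."""
    return downtime_hours > oplog_window_hours(oplog_size_gb, write_rate_gb_per_hour)


if __name__ == "__main__":
    # Hypothetical deployment: 10 GB oplog, 2 GB of oplog writes per hour.
    print(oplog_window_hours(10, 2))           # 5.0
    print(downtime_requires_resync(3, 10, 2))  # False
    print(downtime_requires_resync(6, 10, 2))  # True
```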

How much does it cost to use Cloud Manager Backup?

The pricing for Cloud Manager Backup is based on snapshot size, schedule, and retention policy. See Backup Costs.