Replication with Cloudant, Pt. 1

By Max Thayer

How do you keep databases in sync? Replication.

Or maybe you want to take a snapshot of a database? Replication.

Replication copies some or all documents from one database to another. Where these databases live is irrelevant: one, both, or neither might be on Cloudant. As long as both implement CouchDB’s API (e.g., Cloudant, PouchDB, TouchDB, or CouchDB itself), you’re good to go.

What is it useful for?

Replication synchronizes changes across databases. This is a common requirement in mobile applications, where users have spotty or slow internet access, so the app caches data and works with local copies that are later synchronized with a Cloudant database.

Say you have a mobile app that gets data from a server and wants to stay in sync with it as often as possible. The app won’t always be able to reach the server, so you’ll want to store data locally and synchronize it via replication once a connection is available.

For storing CouchDB data locally in web apps, look into PouchDB. For iOS or Android apps, use TouchDB. They follow CouchDB’s API, but have the speed benefit of storing data locally so such apps can work offline.

Basic Usage

First, you make a POST request to [your_couchdb_uri]/_replicate with a JSON document in the request body specifying which database you’d like to replicate (the source), and where you’d like to send the replicated documents (the target). Here’s an example:

HTTP request:

POST https://[username].cloudant.com/_replicate

You can authenticate your request using Basic Auth or cookie-based auth via _session.

Request body:

{
    "source": "https://[username]:[password]@[username].cloudant.com/othertestdb/",
    "target": "https://[username]:[password]@[username].cloudant.com/testdb/"
}

If you get a 200 response, it worked! The response body should contain some stats about the replication: how many docs were read and copied, any failures that occurred, and so on.
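As a minimal sketch of the request above, the following Python builds the same POST using only the standard library. The account name, password, and database names are placeholders, and build_replication_request and run_replication are hypothetical helper names, not part of any Cloudant client library. It assumes Basic Auth on the request itself and credentials embedded in the source and target URLs, as in the example.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def build_replication_request(account, password, source_db, target_db):
    """Build (but do not send) a POST to the account's _replicate endpoint."""
    base = "https://{0}.cloudant.com".format(account)
    # Percent-encode the password so special characters survive inside a URL.
    db_url = "https://{0}:{1}@{0}.cloudant.com/{2}/"
    safe_password = quote(password, safe="")
    body = {
        "source": db_url.format(account, safe_password, source_db),
        "target": db_url.format(account, safe_password, target_db),
    }
    # Basic Auth header for the request itself.
    token = base64.b64encode("{0}:{1}".format(account, password).encode("utf-8"))
    return urllib.request.Request(
        base + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token.decode("ascii"),
        },
        method="POST",
    )


def run_replication(request):
    """Send the request and return the parsed JSON stats from the response body."""
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling run_replication(build_replication_request("myaccount", "mypassword", "othertestdb", "testdb")) would send the request; building and sending are kept separate so the request can be inspected before it goes over the wire.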

This kind of one-off replication is great for taking snapshots of your database.

Continuous Replication

But say you want to keep the source and target permanently in sync, as in our mobile app scenario. To do this, we set the continuous field in our request body, like so:

{
    "continuous": true,
    "source": "https://[username]:[password]@[username].cloudant.com/othertestdb/",
    "target": "https://[username]:[password]@[username].cloudant.com/testdb/"
}

This kicks off a process that copies any changes in source to target as they occur by listening to the source DB's _changes endpoint.
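The one-off and continuous request bodies differ only in the continuous flag, which a small helper can make explicit. This is just a sketch: replication_body is a hypothetical name, and the URLs you pass in are whatever source and target you would otherwise write by hand.

```python
import json


def replication_body(source, target, continuous=False):
    """Return the JSON request body for a one-off or continuous replication."""
    body = {"source": source, "target": target}
    if continuous:
        # Only include the flag when asking for continuous replication.
        body["continuous"] = True
    # sort_keys keeps the output stable: continuous, source, target.
    return json.dumps(body, indent=4, sort_keys=True)
```

Passing continuous=True produces a body with the same three fields as the example above; leaving it out produces the one-off body from Basic Usage.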

If the connection goes down, the replicator process retries a finite number of times before giving up. It waits 2.5 seconds after the first failure, 5 seconds after the second, 10 after the third, and so on, doubling each time up to a maximum of 5 minutes between retries. Ten attempts (the default) gives your remote user a little over twenty minutes to regain connectivity. If you want to extend that window, set retries_per_request in the request body:

{
    "continuous": true,
    "retries_per_request": 200,
    "source": "https://[username]:[password]@[username].cloudant.com/othertestdb/",
    "target": "https://[username]:[password]@[username].cloudant.com/testdb/"
}

That extends the window to regain connectivity to a little over sixteen hours.
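To see where a figure like that comes from, here is a small sketch that adds up the waits under the schedule described above: 2.5 seconds at first, doubling after each failure, capped at 5 minutes. The function name retry_window_seconds is hypothetical, and the sum counts only the waiting time between attempts, not the time each attempt itself takes.

```python
def retry_window_seconds(retries, first_wait=2.5, max_wait=300.0):
    """Total seconds spent waiting across `retries` retries with capped doubling."""
    total = 0.0
    wait = first_wait
    for _ in range(retries):
        total += wait
        # Double the wait after each failure, but never exceed the cap.
        wait = min(wait * 2, max_wait)
    return total


if __name__ == "__main__":
    seconds = retry_window_seconds(200)
    print("{0} seconds, about {1:.1f} hours".format(seconds, seconds / 3600))
```

For 200 retries the function returns 58217.5 seconds, roughly 16.2 hours, in line with the figure above.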

More to Come

This is only the beginning of what you can do with replication. In future posts, I'll discuss monitoring, cancelling, and configuring replications on the fly.

Happy coding!
