Bending the Oplog to Your Will

Part 3 of the replication internals series: three handy tricks.

DIY triggers
Using the oplog for crash recovery
Creating non-replicated collections

This is the third post in a three-part series on replication. See also parts 1 (replication internals) and 2 (getting to know your oplog).

DIY triggers

MongoDB has a type of query that behaves like the tail -f command: it shows you new data as it’s written to a collection. This is great for the oplog, where you want to see new records as they pop up and don’t want to query over and over.

To run this type of ongoing query, you ask MongoDB for a tailable cursor (these only work on capped collections, which the oplog is). When this cursor gets to the end of the result set, it hangs around and waits for more elements to be added to the collection. As they’re added, the cursor returns them. If no elements are added for a while, the cursor times out and the client has to requery if it wants more results.
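Here’s a rough sketch of that wait-then-requery loop in Python. It assumes a PyMongo-style cursor, meaning something you can iterate that also has an alive attribute. The names follow, open_cursor, and open_oplog_cursor are made up for illustration, and local.oplog.rs is an assumption about where a replica set member keeps its oplog.

```python
import time


def follow(open_cursor, max_requeries=None, idle_sleep=1.0):
    """Yield documents as they arrive, re-opening the cursor when it dies.

    open_cursor is a hypothetical callback: it takes the last "ts" value
    seen (or None) and returns a fresh tailable cursor.
    """
    last_ts = None
    requeries = 0
    while True:
        cursor = open_cursor(last_ts)
        while cursor.alive:
            # The for loop ends when there is nothing new right now;
            # the cursor itself may still be alive and worth retrying.
            for doc in cursor:
                last_ts = doc["ts"]
                yield doc
            time.sleep(idle_sleep)
        # The cursor timed out or was killed: requery, unless we hit the cap.
        requeries += 1
        if max_requeries is not None and requeries > max_requeries:
            return


def open_oplog_cursor(client, last_ts=None):
    """One possible open_cursor for PyMongo (assumes a replica set member)."""
    from pymongo import CursorType  # imported here so follow() works without it

    query = {} if last_ts is None else {"ts": {"$gt": last_ts}}
    return client["local"]["oplog.rs"].find(
        query, cursor_type=CursorType.TAILABLE_AWAIT
    )
```

With a real connection you would call something like follow(lambda ts: open_oplog_cursor(client, ts)) and loop over the results. Passing the last ts back in means a requery picks up where the dead cursor left off instead of starting over.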

Using your knowledge of the oplog’s format, you can use a tailable cursor to do a long poll for activities in a certain collection, of a certain type, at a certain time… almost any criteria you can imagine.
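To make “almost any criteria” a bit more concrete, here is a small sketch that builds a query document for the oplog. It relies on the oplog fields ns (the "db.collection" namespace), op (a one-letter operation code), and ts (the timestamp). The function name oplog_filter and its parameters are my own invention, not part of any driver.

```python
# One-letter codes the oplog uses for the common operation types.
OP_CODES = {"insert": "i", "update": "u", "delete": "d"}


def oplog_filter(db=None, collection=None, kind=None, since=None):
    """Build a query document matching oplog entries of interest.

    since is passed through untouched; with a real driver it would be
    a BSON timestamp, so that it compares correctly against "ts".
    """
    query = {}
    if db and collection:
        query["ns"] = f"{db}.{collection}"
    if kind is not None:
        query["op"] = OP_CODES[kind]
    if since is not None:
        query["ts"] = {"$gt": since}
    return query


# Example: only inserts into the (made-up) blog.posts collection.
inserts_only = oplog_filter("blog", "posts", kind="insert")
```

You would hand the resulting dictionary to the find call that opens your tailable cursor, and your “trigger” fires only for the entries you care about.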

Using the oplog for crash recovery

Suppose your database goes down, but you have a fairly recent backup. You could put the backup into production, but it’ll be a bit behind. You can bring it up to date using your oplog entries.

If you use the trigger mechanism (described above) to capture the entire oplog and send it to a non-capped collection on another server, you can then use an oplog replayer to play the oplog over your dump, bringing it as up-to-date as possible.

Pick a time before the dump was taken and start replaying the oplog from there. It’s okay if you’re not sure exactly when the dump was taken, because oplog operations are idempotent: you can apply them to your data as many times as you want and your data will end up the same.

Also, warning: I haven’t tried out the oplog replayer I linked to; it’s just the first one I found. There are a few different ones out there and they’re pretty easy to write.
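To show why replaying too much is harmless, here is a toy replayer in Python. It is a deliberately simplified model, not a real tool: the “database” is a plain dict keyed by _id, updates are assumed to arrive as $set documents, and commands and no-ops are ignored. The names apply_entry and replay are hypothetical.

```python
def apply_entry(data, entry):
    """Apply one simplified oplog entry to data, a dict of {_id: document}."""
    op = entry["op"]
    if op == "i":
        doc = entry["o"]
        # Insert-or-overwrite, so seeing the same insert twice is harmless.
        data[doc["_id"]] = dict(doc)
    elif op == "u":
        _id = entry["o2"]["_id"]
        if _id in data:
            # Only $set updates are handled in this sketch.
            data[_id].update(entry["o"].get("$set", {}))
    elif op == "d":
        # Deleting something that is already gone is fine.
        data.pop(entry["o"]["_id"], None)


def replay(data, entries):
    """Apply entries in order and return the resulting data."""
    for entry in entries:
        apply_entry(data, entry)
    return data


oplog = [
    {"op": "i", "o": {"_id": 1, "views": 0}},
    {"op": "u", "o2": {"_id": 1}, "o": {"$set": {"views": 1}}},
    {"op": "i", "o": {"_id": 2, "views": 5}},
    {"op": "d", "o": {"_id": 2}},
]

once = replay({}, oplog)
twice = replay(replay({}, oplog), oplog)
```

Here once and twice come out equal, which is the property that lets you start the replay from a point safely before the dump. Note that the update sets views to a final value rather than incrementing it; an increment applied twice would not be idempotent.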

Creating non-replicated collections

The local database contains data that is local to a given server: it won’t be replicated anywhere. This is one reason why it holds all of the replication info.

local isn’t reserved for replication stuff: you can put your own data there, too. If you do a write in the local database and then check the oplog, you’ll notice that there’s no record of the write. The oplog doesn’t track changes to the local database, since they won’t be replicated.
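If you want to check this yourself, something like the following sketch should do it. It takes a PyMongo-style client (anything that supports client["db"]["collection"] with insert_one and find_one), writes a marker document into the local database, and then looks for that namespace in the oplog. The function name, the scratch collection, and the oplog.rs collection name are all assumptions for illustration.

```python
def local_write_shows_in_oplog(client, collection_name="scratch"):
    """Write to local.<collection_name>, then look for it in the oplog.

    Returns True if any oplog entry mentions the namespace we wrote to.
    Assumes the oplog lives in local.oplog.rs (a replica set member).
    """
    local_db = client["local"]
    local_db[collection_name].insert_one({"note": "just for this server"})
    namespace = "local." + collection_name
    return local_db["oplog.rs"].find_one({"ns": namespace}) is not None
```

Going by the behavior described above, you’d expect this to return False on a replica set member, while the same check against a collection in a regular database should find an entry.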

And now I’m all replicated out. If you’re interested in learning more about replication, check out the core documentation on it. There’s also core documentation on tailable cursors and language-specific instructions in the driver documentation.

5 thoughts on “Bending the Oplog to Your Will”

Pingback: Comparing MongoDB and SQL Server Replication | Jeremiah Peschka
Roly says:

December 29, 2011 at 3:22 pm

Question: how does this work with multiple shards where each shard is replicated? Is there one oplog per shard, or one oplog for the whole configuration? Would I be able to query the mongos process for the oplog of the configuration?

1. Anonymous says:
  
  December 29, 2011 at 4:40 pm
  
  There is one oplog per shard. Each shard is a replica set that doesn’t “know” it’s a shard (okay, it does, but it’s really just a normal replica set). So, if you wanted, you could connect directly to the shard to mess with its oplog.
  
  Honestly, I’m not sure what mongos does if you try to use the local database. It might put it on the config servers or it might use one of the shards or it might give you an error… you could try it and report back!
  
  1. Roly says:
    
    January 3, 2012 at 12:39 pm
    
    Thanks for your response. I tried it out: if you use the local db on mongos, it returns an error saying that’s not allowed.
    