Dependent Origination

New IPs for shard servers

Posted on: September 13, 2012

I did something stupid yesterday morning, apparently after only five or six hours’ sleep: I terminated some EC2 instances that were still being used as shard servers, meaning their IPs were still stored in the main config database of our MongoDB setup. This caused a major headache that took me an entire day to resolve.

First of all, restarting an instance from its EBS volume (luckily I did not delete the disk image 🙂) does not bring back the original IPs.

According to this post, you can go into the config database and change the IP addresses of the shard servers, so that when you run db.runCommand({listShards:1}) you see the shards with their updated IPs. Unfortunately it didn’t work for me, because the config db (on the shard that was still up and running) for some reason refused to show me any existing collections or data. I’m not sure why. Otherwise, from the post it sounded like the author went into those collections and modified them directly.
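For readers who can see their config collections, here is a minimal sketch of the kind of edit that post describes. It assumes the shard entries live in the shards collection of the config database, with a host field shaped like "rs0/10.0.0.1:27018,10.0.0.2:27018" for a replica-set shard or "10.0.0.1:27018" for a standalone one. The function name rewriteShardHost, the ipMap argument, and all addresses are hypothetical, and this was not run against the setup described here.

```javascript
// Hypothetical helper: swap old IPs for new ones inside one shard host string.
// ipMap maps an old address to its replacement, e.g. {"10.0.0.1": "10.0.9.9"}.
function rewriteShardHost(host, ipMap) {
  // Keep an optional "replicaSetName/" prefix untouched.
  var slash = host.indexOf("/");
  var prefix = slash === -1 ? "" : host.slice(0, slash + 1);
  var members = host.slice(slash + 1).split(",");

  var rewritten = members.map(function (member) {
    // Split "address:port" on the last colon; the port is optional.
    var colon = member.lastIndexOf(":");
    var addr = colon === -1 ? member : member.slice(0, colon);
    var port = colon === -1 ? "" : member.slice(colon);
    var known = Object.prototype.hasOwnProperty.call(ipMap, addr);
    return (known ? ipMap[addr] : addr) + port;
  });

  return prefix + rewritten.join(",");
}

// Possible use in the mongo shell, connected to a config server, after a
// backup and with the mongos processes stopped (an assumption, not a
// tested procedure):
//   var cfg = db.getSiblingDB("config");
//   cfg.shards.find().forEach(function (s) {
//     cfg.shards.update({ _id: s._id },
//                       { $set: { host: rewriteShardHost(s.host, ipMap) } });
//   });
```

For example, rewriteShardHost("rs0/10.0.0.1:27018,10.0.0.2:27018", {"10.0.0.1": "10.0.9.9"}) returns "rs0/10.0.9.9:27018,10.0.0.2:27018". Keeping the string rewrite separate from the update makes it easy to print the new values and check them before touching the config database.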

Other sites mentioned other ways to fix it, such as adding your new instance’s IP to the old replica set and then removing the old one. However, this requires that the old replica set is still running. Since I had terminated the instances and their IPs had changed, I had no way of restarting the replica set without mongo complaining about a mismatched config. For example, I could not start it under a new replica set name (it complained that the new name and the old name did not match), and I could not start it under the old name either (it complained that self is not in the replica set). Even though all I had left of the replica set was its data directories, apparently some config information is stored along with the data.
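The config information in question is the replica set configuration, which mongod keeps in the system.replset collection of the local database, inside the data directory. One approach that might have helped, sketched here under that assumption and not tried on the setup above, is to rewrite the member addresses in that document and apply it with a forced reconfig. The function name remapReplSetHosts, the ipMap argument, and all addresses are hypothetical.

```javascript
// Hypothetical helper: replace old member addresses in a replica set
// config document. Modifies cfg in place and returns it.
function remapReplSetHosts(cfg, ipMap) {
  cfg.members.forEach(function (m) {
    // Member hosts look like "address:port"; split on the last colon.
    var colon = m.host.lastIndexOf(":");
    var addr = colon === -1 ? m.host : m.host.slice(0, colon);
    var port = colon === -1 ? "" : m.host.slice(colon);
    if (Object.prototype.hasOwnProperty.call(ipMap, addr)) {
      m.host = ipMap[addr] + port;
    }
  });
  return cfg;
}

// Possible use in the mongo shell on the restarted member (an assumption,
// not a tested procedure):
//   var cfg = db.getSiblingDB("local").system.replset.findOne();
//   rs.reconfig(remapReplSetHosts(cfg, ipMap), { force: true });
```

The force option is meant for cases where a majority of the set cannot be reached, which is roughly the situation after the old instances are gone. It can lose writes that were never replicated, so it is a last resort.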

So I had only one way out: I restarted the good shard as a single-shard (one replica set) setup so that production traffic could go ahead without any more write errors. Since mongos was still distributing writes to the missing shard, there had been lots of write errors. More on this later, but in fact I had enabled sharding without ever actually running the sharding command, so there was no real data on the missing shard, and still mongo refused to operate.

As for the missing shard, it is really hard to start a fresh setup from its data directories. I originally planned to start new instances on them, use mongoexport to export the data, and then mongoimport it into the good shard. The plan never worked because, as described above, some config information was stored together with the data. Luckily, after three hours I figured out that I had never really sharded the data, so I didn’t have to restore anything. Without this revelation, this would still be a disaster. But now it is merely a mishap 🙂

Here is another post I found useful while trying to modify the IPs in the existing config database. There are other suggestions in the thread that were not applicable to my case but could be useful in yours.

