Dependent Origination

cannot start up mongos process

Posted on: July 4, 2012

Last weekend’s Amazon power outage took all our machines down. By Saturday afternoon everything was fine except our database instance, which in the end was never recovered 😦 I thought our MongoDB was working on Saturday, but on Monday I discovered that a weird mongod process was running and that I couldn’t start up our usual setup of config servers, shard servers and the mongos process. mongos would always complain about ‘cannot upgrade from 3 to 2’ and just quit.

It puzzled me for quite a few hours, and we tried many things to identify the root cause. At first I thought the process could not find ‘localhost’, but telnet localhost worked fine. Then we dug into the logs looking for more useful error messages. I started the config servers and mongos on a different machine talking to the same shard server and that worked fine, so I concluded that the machine itself was the problem after the reboot. Finally, googling the error message solidified my suspicion that the package I had installed on the machine might have gone through an incomplete update or something similar, so I removed the package and used the binaries downloaded directly from mongo’s own website. Everything has worked fine since. Phew.

This page has a complete list of commands for manipulating packages.

Here is the command line for removing an installed package using apt-get.

apt-cache search SearchTerm [searches for a package in the configured package sources], which isn’t very useful in our case but is listed here for the sake of completeness.
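For reference, the search and removal steps can be sketched as a small script. This is only a sketch: the `run` helper, the `remove_package` function and the `DRY_RUN` variable are names made up for this example, and by default it just prints the commands rather than running them. The package name `mongodb` is the distro package mentioned in the note below.

```shell
#!/bin/sh
# Dry run by default: each command is printed instead of executed.
# Set DRY_RUN=0 to really run them (needs sudo on a Debian/Ubuntu machine).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1: name of the package to look up and remove.
remove_package() {
    run apt-cache search "$1"       # list matching packages in the configured sources
    run sudo apt-get remove "$1"    # uninstall the package, keeping its configuration files
}

remove_package mongodb
```

If the configuration files should go as well, `apt-get purge` can be used in place of `apt-get remove`.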

Note: the package I removed is the mongodb package installed via apt-get. The mongo site has instructions for installing mongodb-10gen, which I haven’t tried, so I don’t know whether mongodb-10gen is a better package.

The benefits of installing a package are (1) easy removal and (2) mongo gets installed as a service for you. Running mongo as a service gives you automatic restart for free: you can edit a handful of configuration files so that the mongod and mongos processes start up the way you want them to. Here are someone’s configuration files for a sharded cluster with replica sets.

I haven’t gone down that route for now either, since I have the startup script in rc.local and we have a one-shard setup, which is really simple. I am probably going to change the setup soon, so more updates on that front later.
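To give an idea of what such a startup script can look like, here is a rough sketch for a one-shard cluster. It is not my actual rc.local: every path and port is an assumption made for illustration, `run`, `start_cluster` and `DRY_RUN` are made-up names, and the flags are the ones the 2.x-era mongod and mongos binaries accept, so newer releases may expect a different setup.

```shell
#!/bin/sh
# Hypothetical startup sketch for a one-shard cluster.
# Dry run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"
MONGO_BIN="${MONGO_BIN:-/opt/mongodb/bin}"   # assumed location of the downloaded binaries
DATA_DIR="${DATA_DIR:-/data}"                # assumed data root
LOG_DIR="${LOG_DIR:-/var/log/mongodb}"       # assumed log directory

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

start_cluster() {
    # 1. config server first, since mongos reads the cluster metadata from it
    run "$MONGO_BIN/mongod" --configsvr --dbpath "$DATA_DIR/configdb" --port 27019 \
        --fork --logpath "$LOG_DIR/configsvr.log"
    # 2. the single shard server that holds the data
    run "$MONGO_BIN/mongod" --shardsvr --dbpath "$DATA_DIR/shard0" --port 27018 \
        --fork --logpath "$LOG_DIR/shard0.log"
    # 3. mongos last, pointed at the config server
    run "$MONGO_BIN/mongos" --configdb localhost:27019 --port 27017 \
        --fork --logpath "$LOG_DIR/mongos.log"
}

start_cluster
```

The order matters more than the exact values: mongos is started last because it needs the config server to be reachable.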

Major lessons learned:

1. think more during an outage: don’t assume everything is fine just because things look fine on the surface

2. always try to prove your own conclusions: if you think localhost isn’t being recognized, there are plenty of ways to verify that speculation

3. read the logs: read all the logs you can find

4. be persistent and ask for help: one person has limitations, and other people can offer different and helpful ways of thinking about the problem
