[rabbitmq-discuss] Clustered startup with multiple queues and multiple masters

Matt Pietrek mpietrek at skytap.com
Tue Jun 12 18:29:52 BST 2012


Looking for some clarification here.

As I understand from other messages on this forum, in a clustered
setup, the last node shut down should be the first node started. My
(possibly incorrect) assumption is that this is because Rabbit and/or
Mnesia may wait for what they believe to be the previous master to
come up first. Starting the "master" first avoids that blocking or
waiting. It also avoids message loss, by preventing a slave that was
out of sync from becoming the master.
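
To make that rule concrete, here is a minimal sketch of how I picture
the ordering. It is not anything RabbitMQ provides: the function name,
the third node name, and the shutdown times are all made up for
illustration, and it assumes we actually know when each node went down.

```python
from datetime import datetime

def startup_order(shutdown_times):
    """Return node names ordered so the last node shut down starts first."""
    return sorted(shutdown_times, key=shutdown_times.get, reverse=True)

# Hypothetical shutdown times; rabbit@play3 is an invented node name.
shutdown_times = {
    "rabbit@play": datetime(2012, 6, 12, 10, 0, 5),
    "rabbit@play2": datetime(2012, 6, 12, 10, 0, 9),
    "rabbit@play3": datetime(2012, 6, 12, 10, 0, 1),
}

print(startup_order(shutdown_times))
```

With a pulled power cord, of course, these times are exactly what we
do not have, which is part of what I am asking about.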

Now, consider a situation like this, where there are N queues that are
mastered on different brokers (e.g., rabbit@play, rabbit@play2). If we
pulled the power cord on all these machines, what should the node
startup order be?

real_cm         rabbit@play   +2  HA D  Active  0 0 0
aliveness-test  rabbit@play             Active  0 0 0
carbon          rabbit@play   +2  HA D  Idle    0 0 0
cmcmd           rabbit@play   +2  HA D  Idle    0 0 0
fake_cm         rabbit@play2  +2  HA D  Idle    0 0 0
fake_mu_queue   rabbit@play2  +2  HA D  Idle    0 0 0
fake_service_2  rabbit@play   +2  HA D  Idle    0 0 0
random          rabbit@play   +2  HA D  Idle
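
To show why a single "master first" order is not obvious here, this
small sketch groups the queues above by the node that masters them.
The (queue, node) pairs are transcribed from the listing, with node
names written in the usual name@host form; the helper name is my own.

```python
from collections import defaultdict

# (queue, master node) pairs transcribed from the listing above.
queues = [
    ("real_cm", "rabbit@play"),
    ("aliveness-test", "rabbit@play"),
    ("carbon", "rabbit@play"),
    ("cmcmd", "rabbit@play"),
    ("fake_cm", "rabbit@play2"),
    ("fake_mu_queue", "rabbit@play2"),
    ("fake_service_2", "rabbit@play"),
    ("random", "rabbit@play"),
]

def masters_by_node(pairs):
    """Group queue names by the node that currently masters them."""
    grouped = defaultdict(list)
    for queue, node in pairs:
        grouped[node].append(queue)
    return dict(grouped)

for node, names in masters_by_node(queues).items():
    print(node, len(names), ", ".join(names))
```

Six queues are mastered on rabbit@play and two on rabbit@play2, so
whichever node starts first, some queues will not have their master
up yet.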

And at the risk of asking a broader question, what is the recommended
approach to restarting from a catastrophic power failure where all
nodes go down within a very short period of time?

In our experiments with RabbitMQ 2.8.2, Ubuntu 10.04, and Erlang R13B03,
it's a total crapshoot whether the cluster comes back up or hangs
with all nodes stuck at the "starting database...." point.

Thanks,

Matt

