[rabbitmq-discuss] Clustered startup with multiple queues and multiple masters

Matt Pietrek mpietrek at skytap.com
Wed Jun 20 00:33:05 BST 2012


Francesco,

Thanks again for the valuable insight from your reply. I'm down to one
issue at this point.

Given what you said earlier about it being OK to start the brokers in any
order, I wrote a simple "catastrophic stress" test. The good news is that
RabbitMQ does what's expected. The bad news: it only does so most of the
time, about 90%.

Here's what the test does:

In a loop (5 times):
    Write 5 messages to a mirrored queue spanning 3 clustered nodes (all
    disc nodes).
    Kill all RabbitMQ processes as quickly as possible - I use "killall
    beam.smp" invoked on all nodes simultaneously via Capistrano.
    Restart all nodes in parallel (again, using Capistrano).
    Verify that each previously written message can be received.
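
In shell terms, the loop is roughly this (a sketch with plain ssh
standing in for the Capistrano plumbing; the node names and the
publish_msgs/verify_msgs helpers are placeholders for our actual
inventory and AMQP client scripts):

    #!/bin/bash
    # Sketch of the stress test. publish_msgs/verify_msgs are
    # hypothetical stand-ins for our real publish/consume clients, and
    # PIDFILE is the same <pidfile> placeholder as in our start script.
    NODES="play1 play2 play3"
    for i in $(seq 1 5); do
        publish_msgs 5                        # write 5 messages to the queue
        for n in $NODES; do                   # kill every broker ASAP
            ssh "$n" 'killall beam.smp' &
        done
        wait
        for n in $NODES; do                   # restart all nodes in parallel
            ssh "$n" 'nohup rabbitmq-server > /dev/null 2>&1 &' &
        done
        wait
        for n in $NODES; do                   # block until each broker is up
            ssh "$n" 'rabbitmqctl wait "$PIDFILE"'
        done
        verify_msgs 5                         # confirm all messages arrive
    done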

~90% of the time, an iteration runs to completion as expected. About 10%
of the time, one of the nodes fails to start, getting stuck with
"starting database ..." as the last line of its startup output.

I see nothing of note in the rabbit@<node>.log file.

When this happens, I have no luck getting the node to join the cluster.
Even attempting a "rabbitmqctl force_reset" hangs. The only way I can get
the cluster fully formed again is to terminate all the nodes, then bring
them back up. (A reset is not required in this case, however.)

For what it's worth, the broker start logic looks like this:
    nohup rabbitmq-server &     # start the broker, detached from the shell
    rabbitmqctl wait <pidfile>  # block until the broker reports it is fully up
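
One variant we're considering (a sketch only; the 120-second limit is an
arbitrary pick, and it assumes coreutils timeout is available) is to
bound the wait, so a wedged node fails the deploy quickly rather than
blocking it indefinitely:

    # PIDFILE stands for the same <pidfile> placeholder as above.
    nohup rabbitmq-server > /dev/null 2>&1 &
    if ! timeout 120 rabbitmqctl wait "$PIDFILE"; then
        # Fail fast rather than hanging forever on "starting database ...".
        echo "broker failed to start within 120s" >&2
        exit 1
    fi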

Are there any trace options or other places to look to understand why a
node gets stuck on "starting database ..."?

This bit of startup unpredictability is causing our VP a lot of concern.
I've been tasked with putting together the complete story, along with
workarounds for any problems like this.

Thanks very much again,

Matt

On Tue, Jun 19, 2012 at 1:49 AM, Francesco Mazzoli
<francesco at rabbitmq.com> wrote:

> Hi Matt,
>
> At Mon, 18 Jun 2012 11:16:40 -0700,
> Matt Pietrek wrote:
> >
> > Francesco,
> >
> > Thanks very much for the detailed reply. It was extremely helpful.
>
> Glad I could help.
>
> > Question 1
> > ----------------
> > I stumbled across this in my own previous experimentation. The
> > question is, do I risk message loss by first starting a node that
> > joined the cluster late, thus not having the full set of messages that
> > other nodes have?
>
> No.
>
> > Question 2
> > ---------------
> > Related to the above scenario, is there any danger (after an unplanned
> > shutdown), in simply letting all the nodes start in parallel and
> > letting Mnesia's waiting sort out the order? It seems to work OK in my
> > limited testing so far, but I don't know if we're risking data loss.
>
> It should be fine, but in general it's better to do cluster operations
> sequentially and at one site. In this specific case it should be OK.
>
> > Question 3
> > ---------------
> > You said:
> >
> > > In other words, it's up to you to restart them so that the node with
> the most up-to-date mnesia is started first
> >
> > Is there any information recorded somewhere (e.g. the logs), which
> > would indicate which node has the "most up to date" Mnesia database? I
> > see messages like:
> >
> >  > Mirrored-queue (queue 'charon' in vhost '/'): Promoting slave
> > <rabbit at play2.1.273.0> to master
> >
> > But I don't know if they're necessarily correlated with which node
> > has the most up-to-date Mnesia.
>
> Well, the fact that something got promoted to master tells you that
> the previous master definitely went down before the promoted node.
>
> There won't be precise information if you stop the nodes abruptly, but
> you can look at "rabbit on node <node> down" messages in the logs -
> the nodes in the cluster are linked in distributed Erlang and these
> messages are generated when a node receives a 'DOWN' message from
> another node. Since they are "info" messages, you need to make sure to
> have those enabled - they are enabled by default but a lot of people
> don't like them.
>
>
> Francesco.
>
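
P.S. For anyone finding this thread later: a quick way to pull those
"rabbit on node <node> down" events out of the logs (the path below is
the default on our Ubuntu installs; adjust for your layout):

    # List the "rabbit on node <node> down" events from every broker log;
    # -B1 pulls in the preceding "=INFO REPORT====" header line, which
    # carries the timestamp, so the shutdown order can be reconstructed.
    grep -B1 'rabbit on node' /var/log/rabbitmq/rabbit@*.log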