[rabbitmq-discuss] Repairing a crashed cluster

Dave Seltzer dseltzer at tveyes.com
Wed Oct 10 14:49:57 BST 2012


Hi everyone,

I have a RabbitMQ cluster with two disk nodes.

Last night something happened to the network and both RabbitMQ servers
crashed.

node1 reports the following:

> BOOT FAILED
> ===========
>
> Timeout contacting cluster nodes: [rabbit@node2].

node2 reports the reciprocal:

> BOOT FAILED
> ===========
>
> Timeout contacting cluster nodes: [rabbit@node1].


My instinct, for the sake of uptime, is to say: "okay, forget node2, let's
break the cluster and bring node1 online".

My problem is that according to the docs I need to issue a "rabbitmqctl
force_reset", which I can't do unless the server is running.

I tried starting it using "rabbitmq-server -detached" but the server just
exited after loading plugins.

Does anyone know the right course of action in this scenario?

Many Thanks!

-Dave