[rabbitmq-discuss] Startup problems after network issues

Emile Joubert emile at rabbitmq.com
Wed Sep 26 15:33:20 BST 2012


Hi Jon,

On 24/09/12 13:17, Jon Bergli Heier wrote:
> During the weekend we had some minor network issues, after which one
> of the cluster nodes crashed and won't start up again.

The {mnesia_locker, ... ,granted} error in the logfile is the result of
a netsplit (the nodes being unable to communicate with each other). The
broker makes no attempt to handle this situation. If your cluster is
likely to suffer network interruptions on a regular basis, consider one
of the other distribution strategies:
http://www.rabbitmq.com/distributed.html

The simplest way to start the failed node is to reset its database and
rejoin it to the cluster. You can reset the database by moving the
database directory out of the way. Mirrored queues should not be
affected by the loss of one node (as long as the remaining nodes were
synchronised), but any non-mirrored queues that were defined on the
failed node will be lost, together with their contents.
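
To illustrate "moving the database directory out of the way", here is a
minimal Python sketch. It only renames the directory so that the old
data is kept as a backup; it does not stop the broker or rejoin the
cluster. The function name, the ".bak-" suffix and the example path are
my own assumptions, not anything the broker prescribes; check where your
installation actually keeps its database before running anything like
this, and make sure the broker on that node is stopped first.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional


def move_database_aside(db_dir: Path, suffix: Optional[str] = None) -> Path:
    """Rename db_dir to '<name>.bak-<suffix>' and return the new path.

    Hypothetical helper: the caller is responsible for stopping the
    broker on this node before the directory is moved.
    """
    db_dir = Path(db_dir)
    if not db_dir.is_dir():
        raise FileNotFoundError(f"no database directory at {db_dir}")
    if suffix is None:
        # Timestamp suffix so repeated resets do not overwrite each other.
        suffix = time.strftime("%Y%m%d-%H%M%S")
    backup = db_dir.with_name(f"{db_dir.name}.bak-{suffix}")
    if backup.exists():
        raise FileExistsError(f"backup target already exists: {backup}")
    shutil.move(str(db_dir), str(backup))
    return backup


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a real broker.
    with tempfile.TemporaryDirectory() as tmp:
        fake_db = Path(tmp) / "rabbit@node1"  # assumed directory name
        fake_db.mkdir()
        print("moved to", move_database_aside(fake_db))
```

On a real node you would pass the actual database directory instead of
the temporary one, then start the broker and rejoin the cluster with
rabbitmqctl. The exact rejoin commands depend on the broker version, so
they are left out here.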


-Emile


