[rabbitmq-discuss] RabbitMQ Cluster - node hanging on join_cluster - mnesia reporting connection issues

Zach Austin zachary.w.austin at gmail.com
Wed Oct 9 18:50:13 BST 2013

Hi All,

We're having trouble getting one machine in our RabbitMQ cluster back up and 
running after a reboot affected two of the four servers in the cluster.

Here is the cluster layout:
rabbit3 (master)

rabbit1 and rabbit2 were rebooted.  Rabbit2 successfully rejoined the 
cluster.  Rabbit1 did not.  Additionally, RabbitMQ will no longer start 
on rabbit1.

Reviewing the log on rabbit1, I find: Mnesia on 'rabbit1' could not connect 
to node(s) ['rabbit2']

I can ping rabbit1 from rabbit2 and vice-versa.
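
A successful ping only shows ICMP reachability, so a TCP-level check between 
the nodes may be more telling.  Below is a minimal sketch in Python, assuming 
epmd listens on its default port 4369 (not confirmed for this setup; the 
dynamic distribution ports are separate and not covered here).  The helper 
names can_connect and check_node are hypothetical, made up for illustration.

```python
import socket

# Assumption: epmd (the Erlang port mapper) uses its default port, 4369.
EPMD_PORT = 4369


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_node(host, port=EPMD_PORT):
    """Resolve host, then try a TCP connection; return a short status string."""
    try:
        address = socket.gethostbyname(host)
    except OSError:
        # Name resolution failed (socket.gaierror is a subclass of OSError).
        return f"{host}: name does not resolve"
    state = "reachable" if can_connect(address, port) else "NOT reachable"
    return f"{host} ({address}) port {port}: {state}"
```

Running print(check_node("rabbit2")) on rabbit1, and the reverse on rabbit2, 
would show whether the host name resolves and whether the port accepts 
connections in each direction.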

What I've done so far:
1) Verified the Erlang cookie values on all cluster nodes are identical.
2) Verified the Windows firewall is disabled on all cluster nodes.
3) Issued "rabbitmqctl forget_cluster_node rabbit1" on the rabbit3 master.
4) Deleted the Mnesia database on rabbit1.
5) Successfully started RabbitMQ on rabbit1 (deleting the Mnesia database 
made this possible).
6) Issued "rabbitmqctl stop_app", followed by "rabbitmqctl join_cluster 

At this point, rabbitmqctl hangs after the "cluster node... with node... " 
line (I waited over 15 minutes).  Reviewing the log on rabbit1 again, I 
find the same issue logged: Mnesia on 'rabbit1' could not connect to 
node(s) ['rabbit2']

Can anyone point me in the direction of what I should check next?

Thank you.
