[rabbitmq-discuss] RabbitMQ Cluster - node hanging on join_cluster - mnesia reporting connection issues

Zach Austin zachary.w.austin at gmail.com
Wed Oct 9 21:43:19 BST 2013


The issue was resolved by restarting RabbitMQ on rabbit2.  I'm not sure why 
this was required, especially after removing, resetting, and re-adding 
rabbit1.
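
For anyone hitting the same symptom: a successful ICMP ping (as mentioned in the original post below) does not show that the Erlang distribution traffic can get through. A minimal Python sketch of a TCP check against the epmd port is below. It assumes epmd is on its default port 4369; the function names are made up for illustration, and an open epmd port still does not prove that the distribution port epmd hands out is reachable.

```python
import socket

EPMD_PORT = 4369  # default port of the Erlang port mapper daemon (assumed unchanged)


def tcp_port_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_nodes(hosts, port=EPMD_PORT):
    """Map each host name to whether the given port accepts TCP connections."""
    return {host: tcp_port_open(host, port) for host in hosts}


# Example, using the host names from the cluster layout in this thread:
# print(check_nodes(["rabbit1", "rabbit2", "rabbit3", "rabbit4"]))
```

Running it from rabbit1 and again from rabbit2 would show whether either side is unable to open a TCP connection to the other, which ping alone cannot tell you.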

On Wednesday, October 9, 2013 12:50:13 PM UTC-5, Zach Austin wrote:
>
> Hi All,
>
> We're having an issue getting one machine in our rabbit cluster back up 
> and running after a reboot affected two of the four servers in the cluster.
>
> Here is the cluster layout:
> rabbit1
> rabbit2
> rabbit3 (master)
> rabbit4
>
> rabbit1 and rabbit2 were rebooted.  rabbit2 successfully rejoined the 
> cluster.  rabbit1 did not.  Additionally, RabbitMQ will no longer start 
> on rabbit1.
>
> Reviewing the log on rabbit1, I find: Mnesia on 'rabbit1' could not 
> connect to node(s) ['rabbit2']
>
> I can ping rabbit1 from rabbit2 and vice-versa.
>
> What I've done so far:
> 1) Verified the Erlang cookie values amongst all cluster nodes are 
> identical.
> 2) Verified the Windows Firewall is disabled on all cluster nodes.
> 3) Issued "rabbitmqctl forget_cluster_node rabbit1" on the rabbit3 master.
> 4) Deleted the mnesia database on rabbit1.
> 5) Successfully started RabbitMQ on rabbit1 (deleting mnesia DB did this).
> 6) Issued "rabbitmqctl stop_app", followed by "rabbitmqctl join_cluster 
> rabbit3".
>
> At this point, rabbitmqctl hangs after the "cluster node... with node... " 
> line (I waited over 15 minutes).  Reviewing the log on rabbit1 again, I 
> find the same issue logged: Mnesia on 'rabbit1' could not connect to 
> node(s) ['rabbit2']
>
> Can anyone point me in the direction of what I should check next?
>
> Thank you.
>
> Zach
>
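
The rejoin sequence from the quoted steps can be sketched as a small Python script that puts a timeout on each rabbitmqctl call, so a hanging join_cluster fails fast instead of blocking for 15 minutes. This is a sketch under assumptions, not a tested procedure: the node name "rabbit@rabbit3" is assumed to be the full form of "rabbit3", the helper names and the 120-second timeout are made up, and the final start_app call is not in the original steps (it is added because join_cluster leaves the application stopped). On Windows the executable is rabbitmqctl.bat, so the ctl argument would need adjusting.

```python
import subprocess


def rejoin_commands(seed_node, ctl="rabbitmqctl"):
    """Return the rabbitmqctl calls for rejoining a cluster via seed_node."""
    return [
        [ctl, "stop_app"],
        [ctl, "join_cluster", seed_node],
        [ctl, "start_app"],  # assumption: not part of the original steps
    ]


def run_with_timeout(commands, timeout_s=120, runner=subprocess.run):
    """Run commands in order, stopping at the first failure or timeout.

    Returns (True, None) on success, or (False, failed_command).
    """
    for cmd in commands:
        try:
            result = runner(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            return False, cmd
        if result.returncode != 0:
            return False, cmd
    return True, None


# Example (run on the node being re-added, here rabbit1):
# ok, failed = run_with_timeout(rejoin_commands("rabbit@rabbit3"))
# print("rejoined" if ok else "failed at: " + " ".join(failed))
```

The runner parameter exists only so the sequence can be exercised without a live broker; in normal use the default subprocess.run is what executes the commands.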