[rabbitmq-discuss] If 2 nodes out of a 3 node cluster go down, the third one becomes unresponsive until one of the nodes is brought back.

Emile Joubert emile at rabbitmq.com
Wed Aug 7 10:32:55 BST 2013


On 06/08/13 17:53, Yamil Einar Asusta Santos wrote:

> I have put the log in
> here: https://gist.github.com/elbuo8/e5171ec85608b7bd7842

The error recorded on 6-Aug-2013::16:29:44 looks like it might be part
of the problem. Can you tell by looking at the other logfiles whether
anything noteworthy happened at that time? Is anything recorded in the
sasl log?

Are you able to reproduce this issue reliably? And does the last
remaining node always have a log entry containing
"{rabbit_node_monitor,handle_info,2}" ?

Would it be possible for you to run an instrumented version of the
broker to help discover the cause of the problem? It will help a lot if
you can reproduce the problem with instrumented code in a test environment.

You may wish to configure "cluster_partition_handling" to "ignore" as a
temporary workaround. This should allow the last node to keep running
while the others are off-line.
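
As a rough sketch, assuming you configure the broker through the
Erlang-term rabbitmq.config file (its location depends on how the broker
was installed, and any settings you already have under the rabbit
section would need to be merged in), the entry would look something
like this:

```erlang
%% rabbitmq.config -- illustrative fragment only; merge with existing settings
[
  {rabbit, [
    %% do not pause or stop this node when it loses contact with its peers
    {cluster_partition_handling, ignore}
  ]}
].
```

The broker reads this file at startup, so the node needs a restart
before the change takes effect.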




-Emile