[rabbitmq-discuss] Delay between minority detected and stopped server

Malte Schirmacher mas at crosscan.com
Tue Aug 13 10:29:35 BST 2013


We are using RabbitMQ 3.1.4 with this bugfix [1] applied.

While playing around with the clustering features, I came across the
following: I killed 2 of the 3 machines in the cluster. As expected, the
remaining machine detected that it was in the minority and, thanks to the
patch, stopped itself. But there was a delay between the minority
detection and the server stopping, as the following log entries show:

=WARNING REPORT==== 13-Aug-2013::11:02:31 ===
Cluster minority status detected - awaiting recovery

=ERROR REPORT==== 13-Aug-2013::11:02:31 ===
Error in process <0.2836.0> on node 'rabbit@rabbit-test-3' with exit 

=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped STOMP TCP Listener on [::]:61613

=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped TCP Listener on [::]:5672
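
For reference, the warning and the first "stopped" report above are 75
seconds apart. A minimal sketch in Python of how that gap can be read off
the report headers, assuming the header format shown in the excerpt and
English month abbreviations (the helper name report_times is only
illustrative and is not part of RabbitMQ):

```python
import re
from datetime import datetime

# Report header format as it appears in the log excerpt above.
HEADER = re.compile(
    r"=(\w+) REPORT==== (\d{1,2}-\w{3}-\d{4}::\d{2}:\d{2}:\d{2}) ==="
)

def report_times(log_text):
    """Return a (level, datetime) pair for each report header in log_text."""
    return [
        # %b expects English month abbreviations in the default C locale.
        (level, datetime.strptime(stamp, "%d-%b-%Y::%H:%M:%S"))
        for level, stamp in HEADER.findall(log_text)
    ]

log = """\
=WARNING REPORT==== 13-Aug-2013::11:02:31 ===
Cluster minority status detected - awaiting recovery

=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped TCP Listener on [::]:5672
"""

times = report_times(log)
# Seconds between the minority warning and the listener being stopped.
delay = (times[-1][1] - times[0][1]).total_seconds()
print(delay)  # 75.0
```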

However, I was unable to reproduce this situation. Still, I'm worried,
as I have somewhat lost confidence in the clustering abilities of
RabbitMQ because of the aforementioned bug (lost messages, lost queues,
lost HA policies, you name it).

Is anyone able to tell me what went wrong here?

Thanks in advance

[1] http://hg.rabbitmq.com/rabbitmq-server/rev/be0b06386a8c
