[rabbitmq-discuss] Delay between minority detected and stopped server
Malte Schirmacher
mas at crosscan.com
Tue Aug 13 10:29:35 BST 2013
Hi,
we are using RabbitMQ 3.1.4 with this bugfix [1] applied.
While experimenting with the clustering features, I came across the following
situation.
I killed 2 of the 3 machines in the cluster. As expected, the remaining
machine detected that it was in the minority and, thanks to the patch, stopped itself.
However, there was a delay of about 75 seconds between the minority detection
and the server stopping, as the following log entries show:
=WARNING REPORT==== 13-Aug-2013::11:02:31 ===
Cluster minority status detected - awaiting recovery
=ERROR REPORT==== 13-Aug-2013::11:02:31 ===
Error in process <0.2836.0> on node 'rabbit@rabbit-test-3' with exit value:
{badarg,[{erlang,register,[rabbit_outside_app_process,<0.2836.0>],[]},{rabbit_node_monitor,'-run_outside_applications/1-fun-0-',1,[{file,"src/rabbit_node_monitor.erl"},{line,391}]}]}
=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped STOMP TCP Listener on [::]:61613
=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped TCP Listener on [::]:5672
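
For anyone who wants to measure this gap in their own logs, here is a minimal Python sketch. It assumes the report header format shown above and English month abbreviations. The function names are made up for illustration and are not part of RabbitMQ, and the sample omits the error report for brevity.

```python
import re
from datetime import datetime

# Matches report headers such as "=INFO REPORT==== 13-Aug-2013::11:03:46 ==="
HEADER = re.compile(
    r"^=(\w+) REPORT==== (\d{1,2}-\w{3}-\d{4}::\d{2}:\d{2}:\d{2}) ===$"
)

# Abridged copy of the log excerpt from this post.
SAMPLE_LOG = """\
=WARNING REPORT==== 13-Aug-2013::11:02:31 ===
Cluster minority status detected - awaiting recovery
=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped STOMP TCP Listener on [::]:61613
=INFO REPORT==== 13-Aug-2013::11:03:46 ===
stopped TCP Listener on [::]:5672
"""


def parse_reports(log_text):
    """Return a list of (level, timestamp, body) tuples, one per report."""
    reports = []
    current = None
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            # %b parses "Aug" only when the locale uses English month names.
            ts = datetime.strptime(match.group(2), "%d-%b-%Y::%H:%M:%S")
            current = [match.group(1), ts, ""]
            reports.append(current)
        elif current is not None:
            # Body lines belong to the most recent header.
            current[2] += line.strip() + " "
    return [(level, ts, body.strip()) for level, ts, body in reports]


def minority_to_stop_delay(log_text):
    """Seconds between minority detection and the first 'stopped ...' report."""
    reports = parse_reports(log_text)
    detected = next(
        ts for _, ts, body in reports
        if "Cluster minority status detected" in body
    )
    stopped = next(
        ts for _, ts, body in reports
        if ts >= detected and body.startswith("stopped ")
    )
    return (stopped - detected).total_seconds()


if __name__ == "__main__":
    print(minority_to_stop_delay(SAMPLE_LOG))
```

On the excerpt above this gives 75.0 seconds (11:02:31 to 11:03:46).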
I have not been able to reproduce this situation since. Still, I am
worried, because the aforementioned bug (lost messages, lost queues, lost
HA policies, you name it) has shaken my confidence in RabbitMQ's clustering
abilities.
Can anyone tell me what went wrong here?
Thanks in advance
malte
[1] http://hg.rabbitmq.com/rabbitmq-server/rev/be0b06386a8c