[rabbitmq-discuss] Delay between minority detected and stopped server

Simon MacMullen simon at rabbitmq.com
Tue Aug 13 15:20:14 BST 2013


Hi.

The "badarg" error you posted is cosmetic, but should probably be fixed. 
It can happen when a node gets two nodedown notifications in rapid 
succession, both of which push it into a minority.
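
To illustrate the failure mode: erlang:register/2 raises badarg when the name is already taken, so if two nodedown notifications each start a process that registers the same name, the second registration fails. The sketch below is a toy Python model of that behaviour, not RabbitMQ's code; NameRegistry and handle_nodedown are hypothetical names, and only the registered name rabbit_outside_app_process comes from the posted stack trace.

```python
class NameRegistry:
    """Toy stand-in for Erlang's registered-name table (hypothetical)."""

    def __init__(self):
        self.names = {}

    def register(self, name, pid):
        # Like erlang:register/2, refuse a name that is already in use.
        if name in self.names:
            raise ValueError("badarg")
        self.names[name] = pid


def handle_nodedown(registry, pid, errors):
    """Model one nodedown notification that decides the node is in a minority."""
    try:
        registry.register("rabbit_outside_app_process", pid)
    except ValueError as exc:
        # The second notification lands here: noisy, but the first
        # process is already registered and carries on with the pause.
        errors.append((pid, str(exc)))


registry = NameRegistry()
errors = []
handle_nodedown(registry, "pid-1", errors)  # first nodedown: registers fine
handle_nodedown(registry, "pid-2", errors)  # second nodedown: name is taken
print(errors)
```

In this model the first registration still stands after the second one fails, which is why the error can be cosmetic: the pause proceeds regardless.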

I'm not able to say what happened to make it take that long to pause, 
but be aware that the pause is (nearly) a complete shutdown so it's not 
inconceivable it could take some time. Was there anything in the logs 
between 13-Aug-2013::11:02:31 and 13-Aug-2013::11:03:46?
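
One way to check is to pull out every report whose header timestamp falls strictly between those two times. This is a minimal sketch that assumes the log uses the "=LEVEL REPORT==== timestamp ===" headers shown in the quoted excerpt; reports_between is a hypothetical helper, and the log file path is left to you.

```python
import re
from datetime import datetime

# Header shape taken from the quoted log excerpt.
HEADER = re.compile(
    r"^=(\w+) REPORT==== (\d{1,2}-\w{3}-\d{4}::\d{2}:\d{2}:\d{2}) ===$"
)
STAMP_FORMAT = "%d-%b-%Y::%H:%M:%S"


def reports_between(log_text, start, end):
    """Return (level, timestamp, body) for reports strictly between start and end."""
    lo = datetime.strptime(start, STAMP_FORMAT)
    hi = datetime.strptime(end, STAMP_FORMAT)
    found = []
    current = None  # the report being collected, or None if out of range
    for raw in log_text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            when = datetime.strptime(match.group(2), STAMP_FORMAT)
            current = [match.group(1), when, []] if lo < when < hi else None
            if current is not None:
                found.append(current)
        elif current is not None and line:
            current[2].append(line)
    return [(level, when, "\n".join(body)) for level, when, body in found]
```

Calling reports_between(text, "13-Aug-2013::11:02:31", "13-Aug-2013::11:03:46") on the contents of the node's log file lists whatever was written during the gap.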

When you say "the aforementioned bug", which bug are you referring to?

Cheers, Simon

On 13/08/2013 10:29AM, Malte Schirmacher wrote:
> Hi,
>
> We are using RabbitMQ 3.1.4 with this bugfix [1] applied.
>
> While playing around with the clustering features, I came across the
> following situation.
> I killed 2 out of 3 machines from the cluster. As expected the remaining
> machine detected its minority and due to the patch it stopped itself.
> But there was a delay between the minority detection and stopping the
> server as the following log entries show:
>
> =WARNING REPORT==== 13-Aug-2013::11:02:31 ===
> Cluster minority status detected - awaiting recovery
>
> =ERROR REPORT==== 13-Aug-2013::11:02:31 ===
> Error in process <0.2836.0> on node 'rabbit@rabbit-test-3' with exit value:
> {badarg,[{erlang,register,[rabbit_outside_app_process,<0.2836.0>],[]},
>          {rabbit_node_monitor,'-run_outside_applications/1-fun-0-',1,
>           [{file,"src/rabbit_node_monitor.erl"},{line,391}]}]}
>
>
> =INFO REPORT==== 13-Aug-2013::11:03:46 ===
> stopped STOMP TCP Listener on [::]:61613
>
> =INFO REPORT==== 13-Aug-2013::11:03:46 ===
> stopped TCP Listener on [::]:5672
>
>
> I have not been able to reproduce this situation since. Still, I'm
> worried, as I have somewhat lost confidence in the clustering abilities
> of RabbitMQ due to the aforementioned bug (lost messages, lost queues,
> lost HA policies, you name it).
>
> Anyone able to tell me what went wrong here?
>
> Thanks in advance
>    malte
>
>
> [1] http://hg.rabbitmq.com/rabbitmq-server/rev/be0b06386a8c

-- 
Simon MacMullen
RabbitMQ, Pivotal

