[rabbitmq-discuss] RabbitMQ node stopped while broker still up

Simon MacMullen simon at rabbitmq.com
Tue Oct 16 11:33:06 BST 2012


Hmm. The SASL log does not necessarily contain anything; it is 
really more of an error log. So I guess there was no error.

Aha! The log contains:

   =INFO REPORT==== 16-Oct-2012::00:48:36 ===
   Halting Erlang VM

We only log that after invocation of "rabbitmqctl stop". So the reason 
that node shut down was, umm, someone told it to.
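
As a rough sketch of how one might check a node's log for that marker 
after the fact: the function name find_halts and the sample text below 
are illustrative assumptions, not anything RabbitMQ ships. It only 
relies on the report header format shown above.

```python
import re

# Header line of an Erlang-style log report, e.g.
# "=INFO REPORT==== 16-Oct-2012::00:48:36 ==="
REPORT_HEADER = re.compile(r"^\s*=(\w+) REPORT==== (\S+) ===\s*$")

def find_halts(log_text):
    """Return timestamps of reports whose body contains 'Halting Erlang VM'."""
    halts = []
    current_ts = None  # timestamp of the report we are currently inside
    for line in log_text.splitlines():
        m = REPORT_HEADER.match(line)
        if m:
            current_ts = m.group(2)
        elif current_ts is not None and "Halting Erlang VM" in line:
            halts.append(current_ts)
    return halts

# Hypothetical sample, copied from the log excerpt quoted above.
sample = """
=INFO REPORT==== 16-Oct-2012::00:48:36 ===
Halting Erlang VM
"""
print(find_halts(sample))
```

A non-empty result would suggest the shutdown was requested rather than 
caused by a crash.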

And regarding network partitions, we get information about that from 
Mnesia. Mnesia will log something like:

   =ERROR REPORT==== 16-Oct-2012::00:04:19 ===
   Mnesia(nplay@app2): ** ERROR ** mnesia_event got
     {inconsistent_database, running_partitioned_network, nplay@web2}

when it has detected a network partition. Note the 
"running_partitioned_network" - it will also log a very similar message 
with "starting_partitioned_network" the first time it starts *after* a 
partition.
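
If it helps when talking to the network team, here is a minimal sketch 
of scanning a log for those Mnesia events. The function name 
find_partition_events and the sample text are assumptions made for 
illustration; the only thing taken from the log format above is the 
shape of the inconsistent_database tuple.

```python
import re

# Matches the tuple Mnesia logs when it detects a partition, e.g.
# {inconsistent_database, running_partitioned_network, nplay@web2}
PARTITION_EVENT = re.compile(
    r"\{inconsistent_database,\s*"
    r"(running_partitioned_network|starting_partitioned_network),\s*"
    r"([^}]+)\}"
)

def find_partition_events(log_text):
    """Return (kind, other_node) pairs for each partition event found."""
    events = []
    for m in PARTITION_EVENT.finditer(log_text):
        # Node names may be quoted atoms; strip whitespace and quotes.
        node = m.group(2).strip().strip("'")
        events.append((m.group(1), node))
    return events

# Hypothetical sample modelled on the report quoted above.
sample = """
=ERROR REPORT==== 16-Oct-2012::00:04:19 ===
Mnesia(nplay@app2): ** ERROR ** mnesia_event got
  {inconsistent_database, running_partitioned_network, nplay@web2}
"""
print(find_partition_events(sample))
```

Each pair names the kind of event and the node on the other side of the 
partition, as seen from the node that wrote the log.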

Future versions of RabbitMQ will make this information more accessible.

Cheers, Simon

On 16/10/12 10:32, Wong Kam Hoong wrote:
> Hi Simon,
>
> I checked the sasl log, but it has not been updated for a long time: the
> issue happened on 16-Oct-2012, but the latest entry in the log is
> only from 2-Oct-2012.
>
> Attached is the requested sasl log for the server.
>
> Yes, I remember you mentioned before that running RabbitMQ in a
> partitioned network is not recommended. We are still waiting for the
> network team to tell us whether those RabbitMQ nodes really are
> deployed in a partitioned network.
>
> Just curious, how does RabbitMQ identify whether the nodes are deployed
> in a partitioned network? I ask so that I can discuss this
> better with the network team.
>
> Regards,
> Wong
>
>
> On Tue, Oct 16, 2012 at 5:10 PM, Simon MacMullen <simon at rabbitmq.com
> <mailto:simon at rabbitmq.com>> wrote:
>
>     Hi. There's nothing in that log to indicate why the node shut down -
>     can you post the sasl log somewhere?
>
>     I don't know if it's related to the network partition. But please
>     bear in mind that network partitions are really bad for RabbitMQ
>     clusters.
>
>     Cheers, Simon
>
>
>     On 16/10/12 03:20, Wong Kam Hoong wrote:
>
>         Hi RabbitMQ Team,
>
>         This morning, while checking the status of the RabbitMQ nodes
>         through the web admin, I found that one of the nodes had stopped.
>
>         RabbitMQ v2.8.7
>         Erlang R14B04
>         Cluster: Yes, 3 RabbitMQs
>
>
>         Attached is the log for your reference.
>
>         After I restarted the service, everything went back to normal.
>
>         I wonder whether the problem is related to a partitioned network:
>
>         http://rabbitmq.1065348.n5.nabble.com/Statistics-database-could-not-be-contacted-Message-rates-and-queue-lengths-will-not-be-shown-td22331.html
>
>         Thanks & Regards,
>         Wong
>
>
>         _________________________________________________
>         rabbitmq-discuss mailing list
>         rabbitmq-discuss at lists.rabbitmq.com
>         https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>
>
>
>     --
>     Simon MacMullen
>     RabbitMQ, VMware
>
>


-- 
Simon MacMullen
RabbitMQ, VMware

