[rabbitmq-discuss] RabbitMQ 3.0.2 stops logging, but otherwise looks healthy?

Matthias Radestock matthias at rabbitmq.com
Thu Jul 18 03:28:42 BST 2013


Matt,

On 17/07/13 18:55, Matt Pietrek wrote:
> Interestingly, on the broker that still logged, I see this message at
> the time of the last log entry of the non-logging machine:
>
> =INFO REPORT==== 2-Jun-2013::11:50:57 ===
> rabbit on node rabbit at foobar up
>
> (again, where foobar is obfuscated).
>
> Digging around some other logs at the time, I see there was a
> mnesia/network split issue just preceding this. However, the broker now
> looks to be happily a part of the cluster, participating in mirrored
> queues, and with a reported uptime matching that of the "INFO REPORT" above.

Hmm. I wonder whether the cluster didn't fully heal. Since you are running 
3.0.x, you do not have the new (>=3.1.0) cluster_partition_handling 
strategy setting available to you, so you almost certainly have a 
half-formed cluster now. This is also borne out by the fact that...

>     23:14 PROD mpietrek at foomq1:/proc/16050$ sudo ls -latr fd
>     total 0
>     l-wx------ 1 foobar foobar 64 2013-06-02 11:50 2 ->
>     /foobar/logs/foomq1.foo.bar.com/rabbitmq-server.log
>     /foobar/logs/foomq1.foo.bar.com/rabbit at foomq1sasl.log
>     l-wx------ 1 foobar foobar 64 2013-06-02 11:50 7 ->
>     /foobar/logs/foomq1.foo.bar.com/rabbit at foomq1.log
>     /foobar/var/lib/rabbit at foomq1/msg_store_persistent/397.rdq
>     /foobar/var/lib/rabbit at foomq1/msg_store_transient/0.rdq

...*no* files got written to after 2013-06-02 11:50, not just log files 
but also none of the files associated with storing persistent and paged 
messages.
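
To double-check this, one can look at the modification time of each file 
the broker has open rather than reading the ls output by eye. Below is a 
minimal sketch, assuming a Linux /proc filesystem and enough privileges to 
read the broker's fd directory. The function name and the proc_root 
parameter are made up for illustration and are not part of RabbitMQ.

```python
import os
import stat
import sys
import time


def open_file_mtimes(pid, proc_root="/proc"):
    """Return sorted (mtime, path) pairs for regular files a process has open.

    proc_root is a hypothetical parameter that exists only so the function
    can be pointed at a different directory tree, e.g. for testing.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    results = []
    for name in os.listdir(fd_dir):
        link = os.path.join(fd_dir, name)
        try:
            target = os.readlink(link)  # path the descriptor points at
            info = os.stat(link)        # follows the link to the open file
        except OSError:
            continue  # descriptor went away or is not accessible
        if stat.S_ISREG(info.st_mode):  # skip sockets, pipes, devices
            results.append((info.st_mtime, target))
    return sorted(results)


if __name__ == "__main__":
    # Usage: sudo python check_fds.py <pid of the broker's Erlang VM>
    for mtime, path in open_file_mtimes(int(sys.argv[1])):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        print(stamp, path)
```

If the newest timestamp printed is still around the time of the partition, 
that agrees with the reading above: the node has not written anything since.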

>  how we might kick things without restarting the broker?

It looks like that node is not participating fully in the cluster, so 
you have little to lose by restarting it. You may in fact have to reset 
it and re-join it to the cluster.
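
For reference, the reset-and-rejoin sequence can be sketched as below. This 
is only an outline under assumptions: the node name rabbit@foomq2 is a 
hypothetical healthy cluster member, the helper names are made up, and the 
script only prints the rabbitmqctl commands unless dry_run is turned off. 
Note that "rabbitmqctl reset" wipes the node's local state, including its 
persistent messages, so only run it once you are sure you want that.

```python
import subprocess


def rejoin_commands(cluster_node):
    """Return the rabbitmqctl invocations for resetting and re-joining a node.

    cluster_node is the name of a node that is still a healthy member of the
    cluster (an assumption; substitute your own).
    """
    return [
        ["rabbitmqctl", "stop_app"],                    # stop rabbit, keep the Erlang VM up
        ["rabbitmqctl", "reset"],                       # wipe this node's local state
        ["rabbitmqctl", "join_cluster", cluster_node],  # re-join via a healthy node
        ["rabbitmqctl", "start_app"],                   # start rabbit again
    ]


def rejoin(cluster_node, dry_run=True):
    """Print the commands, or run them in order when dry_run is False."""
    for cmd in rejoin_commands(cluster_node):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)  # raises if a step fails, stopping the sequence


if __name__ == "__main__":
    rejoin("rabbit@foomq2")  # hypothetical node name; dry run only
```

Run the commands on the affected node itself, and try a plain restart first, 
since the reset may turn out not to be needed.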

Regards,

Matthias.

