[rabbitmq-discuss] RabbitMQ crash in production - help diagnose from log

Matthias Radestock matthias at rabbitmq.com
Sat Jul 2 13:13:27 BST 2011


Bryan,

On 02/07/11 02:56, Bryan Alves wrote:
> A production instance of RabbitMQ crashed on us yesterday, and we have
> no idea why.  RabbitMQ 2.3.1 & Erlang R14B01

Is that the first time it happened? And for how long had that rabbit 
been running when it perished?

> Attached is the SASL log and erl_crash.dump file.

Thanks. It looks like the crash might be triggered by a rare race 
condition. The message store gc is trying to combine two files, one of 
which appears to have vanished from its index. It should be impossible 
to end up in that situation. Nothing obvious jumps out at me from the 
code, so this may take a while to track down.

> Any advice about what could have caused the crash and how to avoid it
> in the future would be appreciated.

I suggest you upgrade to 2.5.1. There have been some changes in that 
area of the code, though nothing specifically related to the above. So I 
doubt the problem will go away with 2.5.1, but it would help to find out 
for sure.

Thanks for bringing this problem to our attention.

Regards,

Matthias.

