[rabbitmq-discuss] RabbitMQ crash in production - help diagnose from log
bryanalves at gmail.com
Sat Jul 2 16:05:20 BST 2011
This is the first time it happened, and the system had been in production
for a few months. The load (in messages/sec) had maybe been up a bit
recently, and become a little spiky, but still on the order of dozens to
low hundreds per second.
If upgrading is likely to fix the situation, we can try setting that up. I'd
like to avoid that effort if it's not going to solve this specific problem
(not a huge fan of upgrading "just because").
Also, as is probably obvious from the report, we can't reproduce the problem.
Thanks for taking a look at this. If there is anything else you need from
me, let me know.
On Jul 2, 2011 8:13 AM, "Matthias Radestock" <matthias at rabbitmq.com> wrote:
> On 02/07/11 02:56, Bryan Alves wrote:
>> A production instance of RabbitMQ crashed on us yesterday, and we have
>> no idea why. RabbitMQ 2.3.1 & Erlang R14B01
> Is that the first time it happened? And for how long had that rabbit been
> running when it perished?
>> Attached is the SASL log and erl_crash.dump file.
> Thanks. It looks like the crash might be triggered by a rare race
> condition. The message store gc is trying to combine two files, one of which
> appears to have vanished from its index. It should be impossible to end up
> in that situation. Nothing obvious jumps out at me from the code, so this
> may take a while to track down.
>> Any advice about what could have caused the crash and how to avoid it
>> in the future would be appreciated.
> I suggest you upgrade to 2.5.1. There have been some changes in that area
> of the code, though nothing specifically related to the above. So I doubt
> the problem will go away with 2.5.1, but it would help to find out for sure.
> Thanks for bringing this problem to our attention.