[rabbitmq-discuss] RabbitMQ Crash

Dave Greggory davegreggory at yahoo.com
Mon Nov 14 15:17:11 GMT 2011


We're running RabbitMQ 2.5.1 in production and it crashed last week. By "crashed" I mean it dropped all connections and the process died (i.e. neither rabbitmqctl status nor ps -ef | grep rabbitmq showed the process running).

Setup -
2 non-clustered RabbitMQ nodes behind a load balancer, with only 1 node active in the load balancer (the 2nd one is there for failover in case of situations like this).
No special config (All defaults)
Only Management plugin used. 

CentOS Linux 2.6.18-164.el5 x86_64
Erlang R13B04

Clients mostly Java-library 2.5.1 (some older apps with 2.3.1 and 2.0.0 clients). 

I attached the logs (the errors start at the 8-Nov-2011::18:13:02 mark). I also backed up mnesia in its error state and can provide it if needed. There's an erl_crash.dump which I can provide if needed as well. We saw no CPU or memory spikes.

I was able to start it up again with no problem. I just saw the following in the logs at startup (which seemed unusual).

   =WARNING REPORT==== 8-Nov-2011::18:42:03 ===

   msg_store_persistent: recovery terms differ from present
   rebuilding indices from scratch
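
For anyone going through the attached logs, here is a minimal sketch of pulling out only the reports from the 18:13:02 mark onward. It assumes every report header has the same shape as the warning quoted above (=LEVEL REPORT==== D-Mon-YYYY::HH:MM:SS ===); I have not checked that against every entry in the log. The function name and the log path in the usage comment are made up for illustration.

```python
import re
from datetime import datetime

# Header shape assumed from the excerpt above, e.g.
# "=WARNING REPORT==== 8-Nov-2011::18:42:03 ==="
HEADER = re.compile(
    r"^=([A-Z]+) REPORT==== "
    r"(\d{1,2})-([A-Za-z]{3})-(\d{4})::(\d{2}):(\d{2}):(\d{2}) ===$"
)

# Explicit month table so parsing does not depend on the system locale.
MONTHS = {name: i + 1 for i, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])}


def reports_since(lines, start):
    """Yield (level, timestamp, body_lines) for reports at or after `start`.

    `lines` is any iterable of strings (an open file works).
    Blank lines are skipped; lines before the first header are ignored.
    """
    current = None
    for raw in lines:
        line = raw.strip()
        m = HEADER.match(line)
        if m:
            # A new header closes the previous report.
            if current is not None and current[1] >= start:
                yield current
            level, day, mon, year, hh, mm, ss = m.groups()
            ts = datetime(int(year), MONTHS[mon], int(day),
                          int(hh), int(mm), int(ss))
            current = (level, ts, [])
        elif current is not None and line:
            current[2].append(line)
    # Emit the last report in the input, if it qualifies.
    if current is not None and current[1] >= start:
        yield current


# Usage (hypothetical path, adjust to wherever the log was extracted):
#   with open("rabbit.log") as f:
#       for level, ts, body in reports_since(f, datetime(2011, 11, 8, 18, 13, 2)):
#           print(level, ts, body[:1])
```

This only groups lines under the nearest preceding header, so it says nothing about which report is the root cause; it just trims the log down to the window of interest.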


We did not upgrade to 2.6.1 because it seemed to be a buggy release (purely based on this mailing list)... i.e. HA nodes config didn't work, etc. I will upgrade our QA environment to 2.7.0 shortly but probably will not go live with 2.7.0 for at least a month (to verify stability). In the meantime, we'd like to understand what happened.

We've been pretty happy with RabbitMQ so far, and it has been quite stable on the server side (the Java client library is another issue, but that's for another topic). I'm excited to start using HA functionality in 2.7.0.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: incident-logs.tar.gz
Type: application/x-gzip
Size: 111956 bytes
Desc: not available
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20111114/c0d57e3f/attachment.bin>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: report-minimal-post-restart.txt
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20111114/c0d57e3f/attachment.txt>

