[rabbitmq-discuss] RabbitMQ hanging on "starting queue supervisor and queue recovery" after system went down (using new persister)

Christian Legnitto clegnitto at mozilla.com
Wed Aug 18 22:36:59 BST 2010


This turned out to be my own problem: the machine ran out of disk space.
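
In case it helps anyone else who hits the same hang, here is a minimal sketch of a free-space check to run before restarting the broker. The data directory path and the 1 GiB threshold are assumptions for illustration only; they are not values from this thread or anything RabbitMQ itself prescribes.

```python
import os
import shutil


def free_bytes(path):
    """Return the free space, in bytes, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def has_free_space(path, min_free_bytes):
    """True if the filesystem holding `path` has at least `min_free_bytes` free."""
    return free_bytes(path) >= min_free_bytes


if __name__ == "__main__":
    # Assumed location of the mnesia directory; adjust for your install.
    data_dir = "/var/lib/rabbitmq/mnesia"
    # Arbitrary illustrative threshold (1 GiB), not a RabbitMQ setting.
    threshold = 1024 ** 3

    if not os.path.exists(data_dir):
        print(f"{data_dir} does not exist; check the path")
    elif has_free_space(data_dir, threshold):
        print(f"OK: {free_bytes(data_dir)} bytes free")
    else:
        print(f"LOW DISK: only {free_bytes(data_dir)} bytes free")
```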

Thanks,
Christian

On Aug 18, 2010, at 4:10 AM, Matthew Sackman wrote:

> Hi Christian,
> 
> On Mon, Aug 16, 2010 at 03:18:57PM -0700, Christian Legnitto wrote:
>> I've been running default with the new persister. I went away for the weekend and saw that my RabbitMQ instance had died (not sure what happened, it doesn't look to be in the logs). In any case, I went to restart the server and it was hanging at "starting queue supervisor and queue recovery". The VM this is on isn't speedy, but I let it go for ~30 mins. I moved the mnesia db out of the way, tried again, and it started instantly.
> 
> Hmm. Any idea what it was doing - was the disk thrashing or CPU very
> busy? In the event of a crash, rabbit has to do various checks on start
> up, which can be time consuming, in order to validate the state of the
> messages in the queues. However, I think last time I benchmarked this,
> it was of the order of 100s of thousands per second. OTOH, that's on an
> 8-core machine with oodles of RAM and decent hard drives.
> 
> I'd be curious as to what it was doing.
> 
>> I can send the mnesia db, but it is very large. I didn't have many queues (3?) but each probably had thousands (or even 100k+) of messages queued up.
> 
> That might be useful though the startup/recovery process is at various
> points pruning, so the data you have may very well now be different to
> the data that was there when you restarted Rabbit. You didn't happen to
> take a backup *before* restarting Rabbit did you?
> 
> Matthew


