[rabbitmq-discuss] rabbit disk_mode branch eating up all RAM, including swap, dying
Brian Whitman
brian at echonest.com
Sun Oct 4 14:03:28 BST 2009
Hi, we're using 184cb96f7846+ (bug20980) and our host alerted us that rabbit
was eating up all available swap on a 16GB real + 8GB swap machine.
"""
  PID USER     PR NI  VIRT RES  SHR S   %CPU %MEM   TIME+ SWAP COMMAND
18445 rabbitmq 18  0 24.7g 14g 1696 S 1087.1 91.7 2268:18  10g beam.smp
In an effort to prevent kernel panic, we restarted the rabbitmq service,
freeing up a considerable amount of swap:
However, the rabbitmq server is not starting again as expected, due to the
following exception:
2009-10-04 06:26:29.797201500 {"init terminating in
do_boot",{{nocatch,{error,{cannot_start_application,rabbit,{{timeout_waiting_for_tables,[rabbit_disk_queue]},{rabbit,start,[normal,[]]}}}}},[{init,start_it,1},{init,start_em,1}]}}
"""
They had to delete the mnesia folder (losing all our disk-backed queues) and
restart; now it's fine. I would guess that this breakage coincided with us
storing quite a large number of unacked messages in the queues (job
instructions for a very large batch).
a) Would upgrading this branch fix this? We were avoiding doing so because
things were relatively stable.
b) Is there anything else I can look at to debug? The logs don't have
anything of importance.