[rabbitmq-discuss] RabbitMQ 2.6.1 is unexpectedly eating up memory
schramm.ingo at googlemail.com
Tue Mar 6 11:22:01 GMT 2012
I'm running a RabbitMQ cluster with 4 nodes (2 disc, 2 ram) on two
machines, with several thousand clients connected. I use the system as
a job queue: 1 queue on 1 of the ram nodes delivers jobs to workers
(a mirrored queue unfortunately doesn't perform that well), and a
number of queues distributed over all nodes carry the responses. Only
the ram nodes have clients connected.
Everything has worked well for months.
Last night I encountered a very strange condition. Since I send a
number of metrics to Ganglia, I have reasonably good insight into
what happened when. I could see that *first* of all Rabbit started to
consume more and more RAM: a huge amount, until the limit was reached,
on the node with the delivering queue, and some time later a smaller
but still visible amount on the other ram node. The traffic at that
time was not unusual and not higher than before. When the RAM limit
was reached, the whole system began to break down like dominoes. I
really have no idea what could have triggered this condition. Some
little Erlang process not consuming its message queue correctly? But
why does it work most of the time? All I can see in the logs is this:
=INFO REPORT==== 5-Mar-2012::19:00:08 ===
starting TCP connection <0.11787.218> from 22.214.171.124:40846
=INFO REPORT==== 5-Mar-2012::23:55:12 ===
vm_memory_high_watermark set. Memory used:15678711560 allowed:
Usual RAM usage under load on that node is < 1GB. Average message
size is 1-2K.
All I can say is that at the time in question most clients were
consuming slowly, but this happens from time to time without causing
any problems. I cannot see anything queueing up before Rabbit
started to eat memory, only afterwards, when everything began to break
down. One thing I can see is a larger number of unacked messages (up
to 2K) rising in conjunction with the memory consumption. But this has
also happened before without a memory problem.
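As a rough sanity check on the figures above, here is a minimal Python
sketch of the arithmetic. It assumes that "2K" unacked means about 2000
messages and that "1-2K" means at most 2 KiB per message; the function
name is only illustrative. Under those assumptions the unacked payload
is a few MiB, so it cannot by itself account for the memory reported
in the log.

```python
# Back-of-the-envelope check using the figures quoted in this post.
# Assumptions (not confirmed anywhere): "2K" unacked = 2000 messages,
# "1-2K" average size = at most 2 KiB per message.
UNACKED_MESSAGES = 2000
MAX_MESSAGE_BYTES = 2 * 1024
MEMORY_USED_BYTES = 15678711560  # "Memory used" from the watermark log entry


def unacked_payload_bytes(count, message_bytes):
    """Upper bound on payload bytes held by unacknowledged messages."""
    return count * message_bytes


payload = unacked_payload_bytes(UNACKED_MESSAGES, MAX_MESSAGE_BYTES)
share = payload / MEMORY_USED_BYTES

print(f"unacked payload: {payload / 2**20:.1f} MiB")
print(f"memory used:     {MEMORY_USED_BYTES / 2**30:.1f} GiB")
print(f"share of memory: {share:.4%}")
```

This only bounds message payload; per-message and per-connection
overhead inside the broker is not modelled here.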
Do you have any hints on how I can debug the problem?