[rabbitmq-discuss] RabbitMQ Queues memory leak?
Emile Joubert
emile at rabbitmq.com
Fri Apr 19 17:39:09 BST 2013
Hi,
On 19/04/13 10:23, Dmitry Saprykin wrote:
> The difference between 1Mb and 180Mb is relatively large, even after
> taking expected differences due to garbage collection into account. We
> can't rule out a memory leak, but need some assistance from you to
> confirm.
>
> Do you see the same asymmetry if the master node for the queues switches
> from one node to the other? So if you shut down the cdaemon2 node, let
> cdaemon4 become the master for all the queues, and turn cdaemon2 back on
> (it will now be a slave node), does the memory on cdaemon2 now grow?
>
>
> Yes, after the current master stops and starts, it becomes a slave and
> its memory starts to grow. Meanwhile, the newly selected master's memory
> is not freed: it stops growing but does not fall back to normal. I have
> attached memory graphs of our nodes to this letter.
Thanks for confirming.
> Have you been able to add a third node to the cluster for testing
> purposes to see if memory grows on more than one slave node?
>
>
> We have not tried to do this yet. But if it can help we can allocate
> one more node. Is it OK to create a test node on the same physical host
> as one of the existing nodes?
This is probably not necessary. Based on the information you have
provided we have identified a problem which could be the cause.
> How long does it take for the memory use to reach the VM memory high
> watermark?
>
>
> The critical point for our cluster comes much earlier than the VM
> memory high watermark. As memory grows, the slave node also uses
> more and more CPU. In our case, when memory consumption reaches
> ~1Gb the broker stops responding.
>
> After a slave restart, memory grows linearly for some time. After that,
> the growth pattern changes: at some moments it increases by a
> constant step (~20Mb). I have marked these steps on the attached graphs.
The linear increase in memory use strongly suggests a memory leak.
Thanks for the detailed information and graphs.
> Can you describe your messaging pattern in a bit more detail for us to
> reproduce the problem - how often are new channels created when
> publishing?
> 2) Create channel
> 3) Create channel
It is likely that the high turnover of channels is a critical
precondition for this leak.
> In order to investigate further it might be helpful to execute some
> diagnostic commands on the broker. Are you able to replicate the problem
> in a staging or QA environment where it is safe to do this?
>
>
> I will execute diagnostic commands on the broker. If something goes
> wrong our messaging falls back to a version without RabbitMQ involved
> :).
Please be aware that the following command could produce a large amount
of output. It should be run on the slave node:
rabbitmqctl eval 'hd(lists:reverse(lists:sort([{process_info(Pid,
memory), Pid, process_info(Pid)} || Pid <- processes()]))).'
Please pipe the output to a file, compress it and email it to me off-list.
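One way to capture and compress the output is sketched below. This is only a minimal sketch, assuming a POSIX shell with gzip available on the slave node; the helper name capture_compressed and the file name top_process.txt are illustrative and not part of RabbitMQ.

```shell
# Hypothetical helper (not part of RabbitMQ): run a command, write its
# stdout and stderr to a file, then gzip that file in place.
capture_compressed() {
    out_file="$1"
    shift
    # Run the remaining arguments as the command and capture all output.
    "$@" > "$out_file" 2>&1
    status=$?
    # Replace out_file with out_file.gz, overwriting any previous copy.
    gzip -f "$out_file" || return 1
    # Report the exit status of the captured command itself.
    return "$status"
}

# Usage on the slave node (shown as a comment, not executed here):
# capture_compressed top_process.txt rabbitmqctl eval \
#   'hd(lists:reverse(lists:sort([{process_info(Pid, memory), Pid, process_info(Pid)} || Pid <- processes()]))).'
```

The result would be top_process.txt.gz in the current directory, which can then be attached to an email.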
Thanks again
-Emile