[rabbitmq-discuss] Possible memory leak in the management plugin

Simon MacMullen simon at rabbitmq.com
Wed Apr 9 17:33:56 BST 2014


On 09/04/14 16:51, Pavel wrote:
> Clearly we have some memory missing: ETS tables report ~225Mb,
> process_info(rabbit_mgmt_db, memory) reports ~2Gb, rabbitmqctl status
> reports ~3.8Gb for mgmt_db.
>
> Q5: What else is included in mgmt_db size when reported by rabbitmqctl
> status?

Nothing. But ets:info(Table, memory) reports a size in words, not 
bytes, so it needs to be multiplied by 8 (the word size on a 64-bit 
system). rabbitmqctl status does that for you, hence ~3.8Gb.
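You can check this against the ets:info output you posted: summing the 
third column (memory in words) of the nine tables and multiplying by 8 
gives roughly the ~1.8Gb figure rabbitmqctl reported after the GC. A 
quick sketch of the arithmetic, using the numbers from your snapshot:

```shell
# Sum the per-table word counts from the ets:info output above,
# then multiply by the word size (8 bytes on a 64-bit system).
words=$((205243 + 906 + 136495 + 175 + 175 + 1059 + 51561447 + 45509178 + 127559980))
bytes=$((words * 8))
echo "$words words = $bytes bytes"
```

That comes out to about 1.8Gb of ETS data, which (plus the process heap) 
is what rabbitmqctl status adds up.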

> Running garbage collection
> (erlang:garbage_collect(global:whereis_name(rabbit_mgmt_db))) did instantly
> reduce the mgmt_db size:
>
> [root at lab-rmq02 pmaisenovich]# /usr/sbin/rabbitmqctl status | grep mgmt_db
>        {mgmt_db,3853521792},
> [root at lab-rmq02 pmaisenovich]# /usr/sbin/rabbitmqctl eval
> 'erlang:garbage_collect(global:whereis_name(rabbit_mgmt_db)).'
> true
> ...done.
> [root at lab-rmq02 pmaisenovich]# /usr/sbin/rabbitmqctl status | grep mgmt_db
>        {mgmt_db,1804503848},
>
> And immediately cleaned up rabbit_mgmt_db memory (even too much so):
>
> {{memory,2069410400},
>   [{5734484,1046,205243},
>    {5738585,5,906},
>    {5742682,2006,136495},
>    {5746779,1,175},
>    {5750876,1,175},
>    {5754973,1,1059},
>    {5759070,916685,51561447},
>    {5763167,1819614,45509178},
>    {5767264,1819847,127559980}]}
> ...done.
> {{memory,5960},
>   [{5734484,1046,205243},
>    {5738585,5,906},
>    {5742682,2006,136495},
>    {5746779,1,175},
>    {5750876,1,175},
>    {5754973,1,1059},
>    {5759070,916685,51561447},
>    {5763167,1819614,45509178},
>    {5767264,1819847,127559980}]}
> ...done.
>
> Q6: In the last snapshot above process_info(rabbit_mgmt_db, memory) is much
> smaller than ETS numbers right below it. Are those not included in the
> process memory calculation?

No, they're not. The "rabbitmqctl status" output adds the process memory 
and ETS memory together.

> Q7: Note, that ETS table sizes didn't go down at all. Isn't GC supposed to
> clean those up?

No, the GC only affects the process heap. ETS is more like a database: 
rows only get deleted when you explicitly tell it to delete them.

> Furthermore, in 4 seconds after manual GC run, the
> process_info(rabbit_mgmt_db, memory) went up from 5960 to 783979640 and kept
> growing up to a certain (different to previous) limit.

Ouch.

So I think there are two issues here:

1) The combination of large numbers of exchanges / queues being 
published to by large numbers of channels is something that the 
management database inherently struggles with; it maintains data for all 
those combinations, so if you have huge numbers of them, you'll have 
huge amounts of data.

Currently there's no switch to say "I just want message rates per queue, 
exchange or channel, not per combination of those things". If there 
were, it would allow the mgmt database to cut its memory usage a lot 
(although it would still be dealing with a lot of incoming data). So 
your only option is to switch off message rates altogether, with 
{force_fine_statistics, false}.
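In rabbitmq.config that would look something like this (a sketch - I'm 
assuming here that you set it under the rabbitmq_management_agent 
application, which is where that flag lives):

```erlang
%% rabbitmq.config -- sketch; disables per-channel/queue/exchange
%% message rate statistics to reduce mgmt_db memory usage.
[
  {rabbitmq_management_agent, [
    {force_fine_statistics, false}
  ]}
].
```

With that set you lose message rates in the management UI, but queue 
lengths and other coarse statistics are still collected.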

2) The management database is not getting GCed enough by Erlang when it 
is very busy. It's probable that we could do something about that, 
although the fact that the amount of garbage goes back up so quickly is 
alarming.

So in the short term, {force_fine_statistics, false} is the best option 
you have. In the longer term we may be able to have the mgmt plugin 
still display some message rates at a lower cost.

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, Pivotal
