[rabbitmq-discuss] Stats node getting slow

Pierpaolo Baccichet pierpaolo at dropbox.com
Sat Nov 16 04:24:03 GMT 2013


Hello Simon, as promised, we ran an experiment today reverting to the old
config that enables fine-grained stats, and the slowdown reproduced very
quickly. I dumped the output of the commands you asked us to run into the
attached files:

log_rabbit_1.txt: rabbitmqctl eval '[{T, [KV || KV = {K, _} <- I,
lists:member(K, [size, memory])]} || T <- ets:all(), I <- [ets:info(T)],
proplists:get_value(name, I) =:= rabbit_mgmt_db].'

log_rabbit2.txt: rabbitmqctl eval
'process_info(global:whereis_name(rabbit_mgmt_db),
memory).'

I also have the output of the rabbitmqctl report command, but it contains a
lot of information that would leak internal details, so I can't forward it
as a whole. Is there something specific you'd like to see from it?


On Mon, Nov 11, 2013 at 6:24 AM, Pierpaolo Baccichet
<pierpaolo at dropbox.com> wrote:

> Hello Simon,
>
> Thanks for the response. Yeah, my gut feeling also pointed toward a leak
> in that rewrite, since this issue started showing up when we upgraded to
> 3.1.x. We ended up disabling the fine-grained stats last Friday, as you
> suggested, because people were getting paged a bit too often :) The
> current config is below:
>
> [
>     {rabbit, [
>         {vm_memory_high_watermark, 0.75},
>         {cluster_nodes, [
>             'rabbit at xyz1',
>             'rabbit at xyz2',
>             'rabbit at xyz3'
>         ]},
>         {collect_statistics, coarse},
>         {collect_statistics_interval, 10000}
>     ]},
>     {rabbitmq_management_agent, [
>         {force_fine_statistics, false}
>     ]}
> ].
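>
> (As a sanity check, something along these lines should confirm the values
> actually in effect; just a sketch reusing the application/key names from
> the config above:)
>
> rabbitmqctl eval 'application:get_env(rabbit, collect_statistics).'
> rabbitmqctl eval 'application:get_env(rabbitmq_management_agent,
> force_fine_statistics).'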
>
> I will give it a few more days with this config and then maybe revert to
> help you figure out this issue. A related question: is there a way to
> programmatically determine which node is the stats node in the cluster? I
> could not find it in the HTTP API.
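>
> (Would something like the following work, reusing the globally registered
> rabbit_mgmt_db name from the eval command you suggested? Just a sketch; it
> should report which node hosts the stats DB process.)
>
> rabbitmqctl eval 'node(global:whereis_name(rabbit_mgmt_db)).'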
>
>
> On Mon, Nov 11, 2013 at 1:52 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
>
>> On 11/11/2013 9:45 AM, Simon MacMullen wrote:
>>
>>> Hmm, the stats DB was more or less rewritten between 3.0.x and 3.1.0 (to
>>> keep stats histories). If there's a memory leak in there I'd very much
>>> like to get to the bottom of it.
>>>
>>
>> Of course the other possibility is that the stats DB is simply
>> overwhelmed with work and unable to keep up. It's supposed to start
>> dropping incoming stats messages in this situation, but maybe it isn't. To
>> determine if this is the case, look at:
>>
>> rabbitmqctl eval 'process_info(global:whereis_name(rabbit_mgmt_db),
>> memory).'
>>
>> - and if the number that comes back looks like it would account for most
>> of the memory used, then that is likely to be the problem. In that case you
>> can slow down stats event emission by increasing collect_statistics_interval
>> (see http://www.rabbitmq.com/configure.html) and/or disable fine-grained
>> stats as I mentioned in the previous message.
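>>
>> (For a rough comparison, a sketch along these lines would show that figure
>> next to the node's total Erlang allocation, assuming erlang:memory(total)
>> is an acceptable yardstick:)
>>
>> rabbitmqctl eval '{process_info(global:whereis_name(rabbit_mgmt_db),
>> memory), erlang:memory(total)}.'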
>>
>>
>> Cheers, Simon
>>
>> --
>> Simon MacMullen
>> RabbitMQ, Pivotal
>>
>
>
-------------- next part (log_rabbit_1.txt) --------------
[{1122389,[{memory,15013779},{size,173767}]},
 {1118292,[{memory,2224189},{size,80306}]},
 {1114195,[{memory,9766146},{size,111877}]},
 {1110098,[{memory,7419},{size,10}]},
 {1106001,[{memory,47570},{size,531}]},
 {1101904,[{memory,47570},{size,531}]},
 {1097807,[{memory,1153628},{size,15446}]},
 {1093710,[{memory,2441124},{size,15493}]},
 {1089558,[{memory,9378},{size,52}]}]
...done.
-------------- next part (log_rabbit2.txt) --------------
{memory,601162240}
...done.

