[rabbitmq-discuss] Stats node getting slow

Simon MacMullen simon at rabbitmq.com
Mon Nov 11 09:45:00 GMT 2013


On 08/11/2013 5:32PM, Pierpaolo Baccichet wrote:
> Things have been working great for almost a year, but recently we have
> seen, unfortunately quite frequently, the stats node suddenly start
> consuming a considerable amount of memory, munching itself to death.
> Last night we had another of these events: the stats node consumed more
> than 10GB of RAM in a matter of 2 hours, until I caught it and restarted
> it, which brought the cluster back to a normal state. We started
> observing this issue a few weeks back (not very frequently) while on
> 3.1.3, and we still see it on 3.2.0. If it helps narrow down the
> problem, I don't think I ever saw it with the 3.0.x series, so I am
> considering going back to those versions, but I would like to get an
> opinion first.

Hmm, the stats DB was more or less rewritten between 3.0.x and 3.1.0 (to 
keep stats histories). If there's a memory leak in there I'd very much 
like to get to the bottom of it.

So to get started, would you be able to send me the output of

$ rabbitmqctl eval '[{T, [KV || KV = {K, _} <- I, lists:member(K, [size, 
memory])]} || T <- ets:all(), I <- [ets:info(T)], 
proplists:get_value(name, I) =:= rabbit_mgmt_db].'

along with

$ rabbitmqctl report

when in this state?

In terms of mitigating this problem:

$ rabbitmqctl eval 'exit(global:whereis_name(rabbit_mgmt_db), die).'

will kill the mgmt DB process and have it restarted, so you don't need 
to restart the entire stats node.

It might also be worth setting the force_fine_statistics configuration 
item to false for the rabbitmq_management_agent application (would need 
to be done on every node). This will disable calculation / display of 
message rates, which stands a good chance of disabling whatever code is 
causing the problem.
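
As a rough sketch, the setting would look something like this in 
rabbitmq.config (Erlang-term format). This assumes the file has no other 
settings; if it does, merge the rabbitmq_management_agent entry into the 
existing list rather than replacing the file. The file's location varies 
by installation, and it is only read at boot, so each node needs a 
restart to pick up the change.

```erlang
%% rabbitmq.config (sketch; merge with any existing application entries)
[
  {rabbitmq_management_agent, [
    %% Disable fine-grained stats, i.e. message rate calculation / display
    {force_fine_statistics, false}
  ]}
].
```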

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, Pivotal

