[rabbitmq-discuss] Stats node getting slow
Simon MacMullen
simon at rabbitmq.com
Mon Nov 11 09:45:00 GMT 2013
On 08/11/2013 5:32PM, Pierpaolo Baccichet wrote:
> Things have been working great for almost a year, but recently we have
> seen, unfortunately quite frequently, the stats node suddenly start
> consuming a considerable amount of memory, munching itself to death.
> Last night we had one of these events again, and the stats node consumed
> more than 10GB of RAM in a matter of 2 hours until I caught it and
> restarted it, which brought the cluster back to a normal state. We
> started observing this issue a few weeks back (not very frequently)
> while on 3.1.3, and we still see it on 3.2.0. If it helps narrow down
> the problem, I don't think I ever saw this with the 3.0.x series, so I
> am considering going back to those versions, but I would like to get an
> opinion first.
Hmm, the stats DB was more or less rewritten between 3.0.x and 3.1.0 (to
keep stats histories). If there's a memory leak in there I'd very much
like to get to the bottom of it.
So to get started, would you be able to send me the output of
$ rabbitmqctl eval '[{T, [KV || KV = {K, _} <- I, lists:member(K, [size,
memory])]} || T <- ets:all(), I <- [ets:info(T)],
proplists:get_value(name, I) =:= rabbit_mgmt_db].'
along with
$ rabbitmqctl report
when in this state?
In terms of mitigating this problem:
$ rabbitmqctl eval 'exit(global:whereis_name(rabbit_mgmt_db), die).'
will kill / restart the mgmt DB, so you don't need to restart the entire
stats node.
It might also be worth setting the force_fine_statistics configuration
item to false for the rabbitmq_management_agent application (would need
to be done on every node). This will disable calculation / display of
message rates, which stands a good chance of disabling whatever code is
causing the problem.
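For reference, a minimal sketch of that setting in the Erlang-terms
rabbitmq.config file. The file location depends on how the broker was
installed, and if a config file already exists this stanza has to be
merged into its existing list rather than added as a second one:

```erlang
%% rabbitmq.config (path varies by install; not specified in this thread)
[
  {rabbitmq_management_agent, [
    %% Disable fine-grained statistics, i.e. message rate calculation.
    {force_fine_statistics, false}
  ]}
].
```

Each node should need a restart before it picks up the change.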
Cheers, Simon
--
Simon MacMullen
RabbitMQ, Pivotal