<div dir="ltr">Hello there,<div><br></div><div>I have recently been having issues with the stats node in a clustered environment: it becomes slow, stops reporting stats, and causes the whole cluster to become extremely slow.</div>
<div><br></div><div>Our setup: we use RabbitMQ (recently updated to 3.2.0) in a 10-node cluster within a single DC, with 3 disk nodes. We have a fairly high number of connections (tens of thousands) but fairly low message throughput (single-digit thousands per second).</div>
<div><br></div><div>Things have been working great for almost a year, but recently we have seen, unfortunately quite frequently, the stats node suddenly start consuming memory at a considerable rate until it is effectively unusable. Last night we had another of these events: the stats node consumed more than 10GB of RAM in about 2 hours, until I caught it and restarted it, which brought the cluster back to a normal state. We started observing this issue a few weeks ago (not very frequently at first) while on 3.1.3, and we still see it on 3.2.0. If it helps narrow down the problem, I don't think I ever saw it with the 3.0.x series, so I am considering going back to those versions, but I would like to get an opinion first.<span id="dbph-1"></span></div>
</div>