[rabbitmq-discuss] Management Plugin stops gathering stats ?
Rene Parra
rparra at homeaway.com
Tue Aug 23 20:59:20 BST 2011
To eliminate variables, we have dropped the RAM nodes and reduced the
cluster to a single disk node.
So, presumably, we are seeing this on the collector node.
(The Jing video shows the stats database is deployed to diskrabbit01-test,
the collector node [lower left-hand side of the video].)
I am making one more tweak in my test and after tomorrow, if that doesn't
work or no one has any other ideas, we will upgrade to 2.5.1.
--rparra
On 8/23/11 2:54 PM, "Aaron Westendorf" <aaron at agoragames.com> wrote:
> When we load tested on 2.4 we found that the collector plugin grew in
> memory until rabbit was OOM-killed. Even though it was on a separate
> cluster node, the problem was repeatable and so we dropped its use.
>
> Are you seeing this on the reporting nodes, or on the collector node?
>
> -Aaron
>
>
> On Mon, Aug 22, 2011 at 1:31 PM, Rene Parra <rparra at homeaway.com> wrote:
>> Hi everyone...
>>
>> I've been performing a load/soak test on RabbitMQ before we promote our
>> configuration into production, and have run into a little bit of a snag.
>>
>> The Management Plugin seems to gather statistics on the test just fine, but
>> somewhere between ~6 min and ~12 min into the soak test, the management plugin
>> seems to STOP collecting stats on my queues!! Thereafter, the reported
>> throughput (falsely) goes to zero. The test is actually proceeding just
>> fine and queues are happily receiving, delivering, and receiving acks for
>> their messages.
>>
>> Restarting the disk node seems to get the management plugin to resume
>> collecting statistics.
>>
>> My question is: Is there some upper threshold in the management plugin that
>> I need to set to continue gathering statistics throughout my test? This
>> feels like a classic "invisible threshold being hit somewhere" problem.
>>
>> Here's a snapshot of the management UI screen working as expected:
>>
>> http://i.imgur.com/8eXNM.png
>>
>> Here's a screencast of the management UI working and then "FAILS":
>>
>> http://goo.gl/jrvAP
>>
>> (please fast forward to these time points in the screencast)
>>
>> @2m 16s: The system runs normally and then I demo the queue stats
>> collection working perfectly.
>> @2m 56s: Connection View shows normal operating behavior.
>> @3m 34s: The "weird behavior" begins... it seems that TPS drops off to 30
>> TPS then 0 TPS!!
>> @4m 22s: Connection View showing throughput has "stopped".
>>
>> External metric gathering confirms the test still "runs" and messages are
>> being delivered successfully.
>>
>> Has anyone else experienced this weirdness with the mgmt plugin?
>>
>> Any help would be greatly appreciated.
>> -rparra
>>
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>
>>
>
>