[rabbitmq-discuss] Active/Active: shutdown of one service brings down the cluster

Vadim Chekan kot.begemot at gmail.com
Thu Feb 9 02:42:56 GMT 2012


Hi all,

Given: 3 servers in an active/active configuration, RabbitMQ 2.7.1, Erlang
R14B03, CentOS, 64-bit.
On at least 2 occasions we have seen the following: CPU utilization on one
of the rabbit servers is abnormally high (40%, when <10% is the norm) for no
obvious reason. We did a graceful restart of the rabbit service, and the
whole cluster went down (restarted).
Another effect is that the cluster seems to enter some (inconsistent?) state
in which queues cannot be registered or deleted, the list of queues cannot
be viewed through the management UI, etc.

Here is a log containing the errors produced while the cluster was in the broken state:
http://pastebin.com/6rweU3MD

Questions:
  Are those errors critical?
  If we run into the high CPU situation again, what can we do: any
additional logging, profiling, process snapshots, etc.?

Vadim.
