[rabbitmq-discuss] HA active/active cluster in a bad state
Matthew Sackman
matthew at rabbitmq.com
Tue Oct 4 22:59:17 BST 2011
Hi Bryan,
On Tue, Oct 04, 2011 at 03:57:09PM -0500, Bryan Murphy wrote:
> This brought the server back up. However, it's not functioning correctly.
> For example, sudo rabbitmqctl cluster_status works fine:
>
> Cluster status of node 'rabbit@domU-12-31-38-07-18-A6' ...
> [{nodes,[{disc,['rabbit@domU-12-31-38-07-18-A6','rabbit@ip-10-202-209-83',
> 'rabbit@domU-12-31-39-06-72-50']}]},
> {running_nodes,['rabbit@domU-12-31-39-06-72-50','rabbit@ip-10-202-209-83',
> 'rabbit@domU-12-31-38-07-18-A6']}]
> ...done.
>
> however, sudo rabbitmqctl list_queues blocks and never returns.
>
> I'm not touching anything else while the cluster is in this state. What
> diagnostics can I provide to help track down this problem?
OK, well, you can Ctrl-C the list_queues. On one of the other nodes, what
does rabbitmqctl cluster_status return?
How big were the queues? We recently fixed some bugs which had
previously been causing queue recovery to take a _very_ long time so it
might be one of those that's afflicting you. What are the CPU and disk
doing on the "stuck" node? If the CPU is spinning or the disk is busy,
then it's probably just taking a very long time to recover.
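
The checks above can be scripted so that a blocked command does not hang the
terminal. What follows is only a rough sketch: the helper name probe and the
30-second limit are made up for illustration, and it assumes the timeout
utility from GNU coreutils is installed. It runs a command with a time limit
and reports whether it returned, was killed at the limit, or failed outright.

```shell
# probe LIMIT CMD [ARGS...]
# Hypothetical helper: runs CMD with a time limit of LIMIT seconds and prints
#   ok      if CMD finished successfully within the limit
#   stuck   if CMD was still running at the limit (GNU timeout exits with 124)
#   failed  if CMD exited with any other non-zero status
# The command's own output is discarded; only the verdict is printed.
probe() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
    case "$status" in
        0)   echo "ok" ;;
        124) echo "stuck" ;;
        *)   echo "failed" ;;
    esac
}

# Example use on each node (the 30-second limit is an arbitrary choice):
#   probe 30 sudo rabbitmqctl cluster_status
#   probe 30 sudo rabbitmqctl list_queues name messages
```

A "stuck" verdict for list_queues alongside "ok" for cluster_status would
match the symptoms described. To see whether the node is actually working
rather than idle, standard tools such as top or vmstat on the "stuck" node
will show the CPU and disk activity.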
Matthew