I have a 3-node HA active/active cluster in our integration environment that is currently in a bad state.<div><br></div><div>I logged into one node, and did an /etc/init.d/rabbitmq-server restart. This never returned. I logged into another terminal, kill -9 ALL rabbitmq processes and then ran /etc/init.d/rabbitmq-server start.</div>
<div><br></div><div>This brought the server back up. However, it's not functioning correctly. For example, sudo rabbitmqctl cluster_status works fine:</div><div><br></div><div><font class="Apple-style-span" face="'courier new', monospace">Cluster status of node 'rabbit@domU-12-31-38-07-18-A6' ...</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace">[{nodes,[{disc,['rabbit@domU-12-31-38-07-18-A6','rabbit@ip-10-202-209-83',</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> 'rabbit@domU-12-31-39-06-72-50']}]},</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace"> {running_nodes,['rabbit@domU-12-31-39-06-72-50','rabbit@ip-10-202-209-83',</font></div><div><font class="Apple-style-span" face="'courier new', monospace"> 'rabbit@domU-12-31-38-07-18-A6']}]</font></div>
<div><font class="Apple-style-span" face="'courier new', monospace">...done.</font></div><div><br></div><div>however, sudo rabbitmqctl list_queues blocks and never returns.</div><div><br></div><div>I'm not touching anything else while the cluster is in this state. What diagnostics can I provide to help track down this problem?</div>
<div><br></div><div>Thanks,</div><div>Bryan</div><div><div><br><br></div></div>