<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div>We have seen the same behavior but don't have a fix for it. In a 3-node HA cluster we sometimes see node 1 marked as out by node 2 and node 2 marked as out by node 1, while node 3 thinks everything is OK. Pivotal Labs was working with us at one point and they didn't have an explanation either. </div><div><br></div><div>That being said, we have had numerous issues getting a stable and reliable 3-node cluster working on Windows Server 2008 R2. We don't see the stability issues in our tests on Linux, but we won't be running production on Linux RabbitMQ nodes for a couple more weeks. </div><div><br></div><div>Cheers,</div><div><br></div><div>Ron</div><div><br>On Feb 10, 2014, at 5:33 PM, Matt Pietrek &lt;<a href="mailto:mpietrek@skytap.com">mpietrek@skytap.com</a>&gt; wrote:<br><br></div><blockquote type="cite"><div><div dir="ltr">Recently we started running a two-node HA cluster of RabbitMQ 3.2.2, with autoheal enabled.<div><br></div><div>After a network partition, I noticed that autoheal didn't appear to work, although the logs indicate it was attempted. The first time it happened, the management UI on both brokers indicated the other broker was missing from the cluster.</div>
<div><br></div><div>The second time this happened, the management plugin seemed to stop functioning afterwards. Most of the web UI was unusable, i.e., it wouldn't tell me which nodes were running, what queues were declared, and so forth.</div>
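<div><br></div><div>For reference, a minimal sketch of how autoheal is typically enabled, assuming the classic Erlang-term <code>rabbitmq.config</code> file. The actual config used on these brokers is not shown in this thread, so this is illustrative rather than a copy of it:</div>
<pre><code>%% rabbitmq.config (illustrative; not the config from this thread)
[
  {rabbit, [
    %% Partition handling mode. Documented values include
    %% ignore (the default), pause_minority, and autoheal.
    {cluster_partition_handling, autoheal}
  ]}
].
</code></pre>
<div>The same setting would normally be applied on every node in the cluster. Running <code>rabbitmqctl cluster_status</code> on each node after a partition should be one way to check the outcome, since its output includes a <code>partitions</code> entry.</div>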
<div><br></div><div><br></div><div>I'm wondering if what I'm seeing below is a known issue or rings any bells. Also, is there any other log output I should look at to determine success/failure?</div><div><br></div><div>
On the "winning" side, the logs look like this. The "ignoring" part in particular is suspicious.</div><div><br></div><div>--------</div><div>
<p class="">=ERROR REPORT==== 3-Feb-2014::09:48:56 ===</p>
<p class="">Mnesia(rabbit@goodnessmq1): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, rabbit@goodnessmq2}</p>
<p class=""><br></p>
<p class="">=INFO REPORT==== 3-Feb-2014::09:48:56 ===</p>
<p class="">Autoheal request received from rabbit@goodnessmq2 when in state {winner_waiting,</p>
<p class=""> [rabbit@goodnessmq2],</p>
<p class=""> [rabbit@goodnessmq2]}; ignoring</p>
<p class=""><br></p>
<p class="">=INFO REPORT==== 3-Feb-2014::09:48:56 ===</p>
<p class="">global: Name conflict terminating {rabbit_mgmt_db,<2783.10073.5>}</p><p class="">--------</p><p class=""><br></p><p class="">On the "losing" side, the logs look like this:</p><p class="">--------</p>
<p class="">=ERROR REPORT==== 3-Feb-2014::09:48:56 ===</p><p class="">Mnesia(rabbit@goodnessmq2): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, rabbit@goodnessmq1}</p><p class=""><br></p>
<p class="">=INFO REPORT==== 3-Feb-2014::09:48:56 ===</p><p class="">Autoheal request sent to rabbit@goodnessmq1</p><p class=""><br></p><p class="">=WARNING REPORT==== 3-Feb-2014::09:48:56 ===</p><p class="">Federation exchange 'skytap' in vhost '/' did not connect to exchange 'skytap' in vhost '/' on amqp://something <a href="http://else.foo.bar.com:5672">else.foo.bar.com:5672</a></p>
<p class="">{error,unknown_host}</p><p class="">=INFO REPORT==== 3-Feb-2014::09:48:56 ===</p><p class="">Statistics database started.</p><p class=""><br></p><p class="">=WARNING REPORT==== 3-Feb-2014::09:48:58 ===</p><p class="">
Federation exchange 'skytap' in vhost '/' did not connect to exchange 'skytap' in vhost '/' on amqp://<a href="http://somethingelse.foo.bar.com:5672">somethingelse.foo.bar.com:5672</a></p><p class="">
</p><p class="">{error,unknown_host}</p><p class="">--------</p></div></div>
</div></blockquote><blockquote type="cite"><div><span>_______________________________________________</span><br><span>rabbitmq-discuss mailing list</span><br><span><a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a></span><br><span><a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a></span><br></div></blockquote></body></html>