<div dir="ltr">This is RabbitMQ 3.2.4, running in a 2-node cluster with all queues mirrored (HA).<div><br></div><div>There was a queue called cmcmd, declared like this:</div><div><br><div><table class="" style="border-collapse:collapse;font-family:Verdana,sans-serif">
<tbody><tr><th style="font-weight:normal;color:black;font-size:12px;line-height:17px;padding:0px 2px 2px;text-align:right;border:none;vertical-align:top">arguments:</th><td style="font-size:12px;line-height:17px;font-family:Verdana,sans-serif;padding:0px 2px 2px;vertical-align:top;border:none">
<table class="" style="border-collapse:collapse"><tbody><tr><th style="font-weight:normal;color:black;font-size:12px;padding:0px 2px 2px;text-align:right;border:none;vertical-align:top">x-ha-policy:</th><td style="font-size:12px;font-family:Verdana,sans-serif;padding:0px 2px 2px;vertical-align:top;border:none">
<acronym class="" title="string" style="background-image:none;color:inherit;padding:0px;border-top-left-radius:2px;border-top-right-radius:2px;border-bottom-right-radius:2px;border-bottom-left-radius:2px;border-style:none none dotted;border-bottom-width:1px;border-bottom-color:rgb(221,221,221)">all</acronym></td>
</tr></tbody></table></td></tr><tr><th style="font-weight:normal;color:black;font-size:12px;line-height:17px;padding:0px 2px 2px;text-align:right;border:none;vertical-align:top">durable:</th><td style="font-size:12px;line-height:17px;font-family:Verdana,sans-serif;padding:0px 2px 2px;vertical-align:top;border:none">
<acronym class="" title="boolean" style="background-image:none;color:inherit;padding:0px;border-top-left-radius:2px;border-top-right-radius:2px;border-bottom-right-radius:2px;border-bottom-left-radius:2px;border-style:none none dotted;border-bottom-width:1px;border-bottom-color:rgb(221,221,221)">true</acronym></td>
</tr></tbody></table><br></div></div><div>At some point we saw a network partition (see below). Autoheal appears to have eventually completed, but afterwards the cmcmd queue no longer existed on the broker.</div><div><br></div><div>
Here's the Autoheal sequence (note the big time gap waiting for the sea5m1mq1 shutdown):</div><div><br></div><div>-----------</div><div>
<p class=""><span class="">21:57</span> <span class="">mpietrek@foo</span>:/$ grep heal 2014-04-14-mq.log </p>
<p class="">2014-04-14 18:02:35 sea5m1mq2 [info] rabbit@sea5m1mq2.log: Auto<span class=""><b>heal</b></span> request sent to rabbit@sea5m1mq1</p>
<p class="">2014-04-14 18:02:35 sea5m1mq2 [info] rabbit@sea5m1mq2.log: Auto<span class=""><b>heal</b></span>: I am the winner, waiting for [rabbit@sea5m1mq1] to stop</p>
<p class="">2014-04-14 18:02:35 sea5m1mq2 [info] rabbit@sea5m1mq2.log: Auto<span class=""><b>heal</b></span>: final node has stopped, starting...</p>
<p class="">2014-04-14 18:57:38 sea5m1mq1 [info] rabbit@sea5m1mq1.log: Auto<span class=""><b>heal</b></span> request received from rabbit@sea5m1mq2</p>
<p class="">2014-04-14 18:57:38 sea5m1mq1 [info] rabbit@sea5m1mq1.log: Auto<span class=""><b>heal</b></span> decision</p>
<p class="">2014-04-14 18:57:38 sea5m1mq1 [info] rabbit@sea5m1mq1.log: Auto<span class=""><b>heal</b></span>: we were selected to restart; winner is rabbit@sea5m1mq2</p></div><div>------------</div><div><br></div><div>And the rabbit@sea5m1mq2 log spew:</div>
<div><br></div><div><br></div><div><div>=ERROR REPORT==== 14-Apr-2014::18:02:30 ===</div><div>** Node rabbit@sea5m1mq1 not responding **</div><div>** Removing (timedout) connection **</div><div><br></div><div>=INFO REPORT==== 14-Apr-2014::18:02:30 ===</div>
<div>rabbit on node rabbit@sea5m1mq1 down</div><div><br></div><div>=ERROR REPORT==== 14-Apr-2014::18:02:30 ===</div><div>Mnesia(rabbit@sea5m1mq2): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, rabbit@sea5m1mq1}</div>
<div><br></div><div>=INFO REPORT==== 14-Apr-2014::18:02:30 ===</div><div>Statistics database started.</div><div><br></div><div>=INFO REPORT==== 14-Apr-2014::18:02:30 ===</div><div>Autoheal request sent to rabbit@sea5m1mq1</div>
<div><br></div><div>=ERROR REPORT==== 14-Apr-2014::18:02:30 ===</div><div>** Generic server &lt;0.204.0&gt; terminating</div><div>** Last message in was {mnesia_locker,rabbit@sea5m1mq1,granted}</div><div>** When Server state == {state,2,{from,&lt;0.302.0&gt;,#Ref&lt;0.0.1372.163190&gt;}}</div>
<div>** Reason for termination == </div><div>** {unexpected_info,{mnesia_locker,rabbit@sea5m1mq1,granted}}</div><div><br></div><div>=ERROR REPORT==== 14-Apr-2014::18:02:30 ===</div><div>** Generic server &lt;0.302.0&gt; terminating</div>
<div>** Last message in was {'DOWN',#Ref&lt;0.0.0.2733&gt;,process,&lt;2782.309.0&gt;,</div><div>                               noconnection}</div><div>** When Server state == {state,</div><div>                            {0,&lt;0.302.0&gt;},</div>
<div>                            {{0,&lt;2782.309.0&gt;},#Ref&lt;0.0.0.2733&gt;},</div><div>                            {{0,&lt;2782.309.0&gt;},#Ref&lt;0.0.0.2734&gt;},</div><div>                            {resource,&lt;&lt;"/"&gt;&gt;,queue,&lt;&lt;"cmcmd"&gt;&gt;},</div>
<div>                            rabbit_mirror_queue_coordinator,</div><div>                            {1,</div><div>                             [{{0,&lt;0.302.0&gt;},</div><div>                               {view_member,</div>
<div>                                   {0,&lt;0.302.0&gt;},</div><div>                                   [],</div><div>                                   {0,&lt;2782.309.0&gt;},</div><div>                                   {0,&lt;2782.309.0&gt;}}},</div>
<div>                              {{0,&lt;2782.309.0&gt;},</div><div>                               {view_member,</div><div>                                   {0,&lt;2782.309.0&gt;},</div><div>                                   [],</div>
<div>                                   {0,&lt;0.302.0&gt;},</div><div>                                   {0,&lt;0.302.0&gt;}}}]},</div><div>                            0,</div><div>                            [{{0,&lt;0.302.0&gt;},{member,{[],[]},0,0}},</div>
<div>                             {{0,&lt;2782.309.0&gt;},{member,{[],[]},0,0}}],</div><div>                            [&lt;0.301.0&gt;],</div><div>                            {[],[]},</div><div>                            [],0,undefined,</div>
<div>                            #Fun&lt;rabbit_misc.execute_mnesia_transaction.1&gt;}</div><div>** Reason for termination == </div><div>** {noproc,{gen_server2,call,</div><div>                        [&lt;0.204.0&gt;,</div>
<div>                         {submit,#Fun&lt;rabbit_misc.6.116010224&gt;,&lt;0.302.0&gt;},</div><div>                         infinity]}}</div><div><br></div><div>=ERROR REPORT==== 14-Apr-2014::18:02:30 ===</div><div>** Generic server &lt;0.203.0&gt; terminating</div>
<div>** Last message in was {mnesia_locker,rabbit@sea5m1mq1,granted}</div><div>** When Server state == {state,1,undefined}</div><div>** Reason for termination == </div></div></div>