<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">From: Matthew Sackman &lt;<a href="mailto:matthew@rabbitmq.com">matthew@rabbitmq.com</a>&gt;<br>

<br>

Yeah, this is a bit of a problem. Essentially, yes, there is a timeout,<br>

after which Rabbit will just give up trying to start. There are<br>

definitely ways in which we could imbue rabbitmqctl with the means to<br>

tell rabbit to abandon all hope of some remote node ever rejoining (and<br>

indeed, even ways to do this without the local rabbit coming up, which<br>

is essential in this case), but we&#39;ve not yet written this. There is a<br>

bug open for this.<br>

<br>

After that, the various local mnesia databases will try to merge<br>

themselves back together, but if two nodes disagree about various<br>

details and they can&#39;t delegate to a common node that was alive at the<br>

point they failed, then mnesia will give up with an &quot;unable to merge<br>

schema&quot; error, and there&#39;ll really be no hope of getting them both to<br>

merge back together without resetting one of those nodes.<br>

<br>

Erm possibly. Much of this ordering stuff comes from mnesia rather than<br>

Rabbit. If the node&#39;s mnesia is happy to start up then Rabbit will then<br>

restore queues without further delay. That may include the promotion of<br>

slaves to master.<br></blockquote><div><br></div><div>Matthew,</div><div><br></div><div>Thanks for your help understanding the failure modes.</div><div><br></div><div>To summarize, if I understand correctly: in the event of a cluster failure, if the master of a mirrored queue fails to recover, then after some timeout the remaining slaves may or may not choose a new master and recover. Whether they do so will depend on whether the underlying Mnesia tables converge. If they do not converge, you may well have to reset one or more nodes, thus discarding any persisted messages on them. In that case, you had better choose wisely and reset the nodes most out of sync with the lost master.</div>

<div><br></div><div>There is also currently no way to tell the cluster that a node will not be rejoining, and thus to avoid waiting for the timeout, but there is an open ticket for this.</div><div><br></div><div>BTW, how long is this timeout? Is it configurable?</div>

<div><br></div><div>So would it be fair to say that mirrored queues are fault tolerant to node loss, but not necessarily to cluster loss?</div><div><br></div><div>Elias Levy</div></div>