<div dir="ltr">I can reproduce this very reliably. <div><br></div><div>How can I run an instrumented version?</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Aug 7, 2013 at 5:32 AM, Emile Joubert <span dir="ltr">&lt;<a href="mailto:emile@rabbitmq.com" target="_blank">emile@rabbitmq.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 06/08/13 17:53, Yamil Einar Asusta Santos wrote:<br>

<br>

&gt; I have put the log in<br>

&gt; here: <a href="https://gist.github.com/elbuo8/e5171ec85608b7bd7842" target="_blank">https://gist.github.com/elbuo8/e5171ec85608b7bd7842</a><br>

<br>

The error recorded on 6-Aug-2013::16:29:44 looks like it might be part<br>

of the problem. Can you tell by looking at the other logfiles whether<br>

anything noteworthy happened at that time? Is anything recorded in the<br>

sasl log?<br>

<br>

Are you able to reproduce this issue reliably? And does the last<br>

remaining node always have a log entry containing<br>

&quot;{rabbit_node_monitor,handle_info,2}&quot; ?<br>

<br>

Would it be possible for you to run an instrumented version of the<br>

broker to help discover the cause of the problem? It will help a lot if<br>

you can reproduce the problem with instrumented code in a test environment.<br>

<br>

You may wish to configure &quot;cluster_partition_handling&quot; to &quot;ignore&quot; as a<br>

temporary workaround. This should allow the last node to keep running<br>

while the others are off-line.<br>
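<br>
For reference, a minimal sketch of that workaround in the classic Erlang-term rabbitmq.config file. It assumes no other rabbit options are already set (otherwise merge the tuple into the existing rabbit section) and that the node is restarted so the setting is picked up:<br>
<pre>
%% rabbitmq.config (file location varies by installation; this is an assumption)
[
  {rabbit, [
    %% keep running when other cluster nodes become unreachable
    {cluster_partition_handling, ignore}
  ]}
].
</pre>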

<span class="HOEnZb"><font color="#888888"><br>


-Emile<br>


</font></span></blockquote></div><br><br clear="all"><div><br></div>-- <br><span style="background-color:rgb(255,255,255)"><span style="background-repeat:repeat repeat">Yamil</span> <span style="background-repeat:repeat repeat">Asusta</span><br>


<br></span>

</div>