> Pause all publishing before (re)starting any cluster nodes.<br>Just want to report back that the workaround did the trick :-) Of course the situation is not ideal, but we have a working cluster again.<br><br>
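In case it helps others doing the same, the sequencing can be sketched as a simple gate in front of all publishers (a minimal Python sketch; the names are illustrative, and the actual publish/restart calls would be whatever your client and deployment tooling use, e.g. pika's basic_publish):

```python
import threading

# Gate all publishers behind a single Event so publishing can be
# paused before a cluster node is (re)started and resumed afterwards.
publish_gate = threading.Event()
publish_gate.set()  # publishing allowed by default


def publish(message, sent_log):
    publish_gate.wait()       # blocks while publishing is paused
    sent_log.append(message)  # stand-in for channel.basic_publish(...)


def restart_node(restart_fn):
    publish_gate.clear()      # pause all publishing first
    try:
        restart_fn()          # (re)start the cluster node here
    finally:
        publish_gate.set()    # resume publishing once the node is back
```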
Thx Matthias! <br><br>Cheers<br>Matthias<br><br><div class="gmail_quote">On Mon, Aug 27, 2012 at 10:44 PM, Matthias Reik <span dir="ltr"><<a href="mailto:matthias.reik@gmail.com" target="_blank">matthias.reik@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">See comments inline<br><br>Thanks<br>Matthias<br><br><div class="gmail_quote"><div class="im">On Mon, Aug 27, 2012 at 5:22 PM, Matthias Radestock <span dir="ltr"><<a href="mailto:matthias@rabbitmq.com" target="_blank">matthias@rabbitmq.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Matthias,<div><br>
<br>
On 27/08/12 16:02, Matthias Reik wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
even though the setup looks slightly different (since we are not<br>
using the shovel plugin), the reason could be the same. We are<br>
explicitly ACKing the messages (i.e. no auto-ack), even though the<br>
consumers are in the same data center (so we should have a reliable<br>
network). But if the acks are lost and that causes memory growth on<br>
the server, then it could be the same bug.<br>
</blockquote>
<br></div>
As noted in my analysis, the bug has nothing to do with the shovel or with consuming/acking: simply publishing to HA queues when (re)starting slaves is sufficient to trigger it. </blockquote></div><div>Wasn't sure I understood it 100% correctly (sorry, not too experienced with RabbitMQ yet). Thx for the confirmation.<br>
</div><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Is there anything I could do to validate this assumption?<br>
</blockquote>
<br></div>
I don't think it's worth the hassle. I am quite certain that you are suffering from the same bug.</blockquote></div><div> OK, if you expect a fix for the issue to appear soon, then I could hold off on "fixing" the cluster and try out any updated version. If it will take more time, then I will probably go for your suggested fix/workaround below.<br>
<br></div><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Is there anything I can do in the meantime to get into a state where I<br>
have a working cluster again<br>
</blockquote>
<br></div>
Pause all publishing before (re)starting any cluster nodes.<br></blockquote></div><div>Yes, that makes sense.<br><br>Thank you for your quick response. <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Regards,<br>
<br>
Matthias.<br>
</blockquote></div><br>
</blockquote></div><br>