<span style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">Hi Matthias,</span><div class="im" style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<div><br></div><div><font color="#3366ff">What's the general health of your rabbit when this happens?</font></div></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
The RabbitMQ nodes seemed fine when the incident happened:</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<b><u>Master Node Usage</u></b></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">Memory : 300M</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
CPU % : Average 30%</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<u><b>Slave Node Usage</b></u></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">Memory : 130M</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
CPU % : Average 30%</div><div class="im" style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><div><br></div><div><font color="#3366ff">Plenty of free memory and disk?</font></div>
</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">Yes, there is plenty of free memory and disk space on both servers.</div><div class="im" style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<div><br></div><div><font color="#3366ff">Are you using mirrored/HA queues at all?</font></div></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">Yes, I'm using HA in all queues.</div>
<div class="im" style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><div><br></div><div><font color="#3366ff">Could there possibly be a spike, i.e. lots of messages getting published in a short space of time?</font></div>
</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">No, I checked and there were only a few messages around that time, e.g. 2012-09-26 03:28:32 => 4 messages</div>
<div class="im" style="color:rgb(80,0,80);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><div><br></div><div><font color="#3366ff">No strange errors in the logs? </font></div></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
After the incident, RabbitMQ did log a warning, e.g.</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<b><u>RabbitMQ Log</u></b></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><b><u><br></u></b></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
<div>=WARNING REPORT==== 26-Sep-2012::03:29:25 ===</div><div>closing AMQP connection <0.8661.18> (<a href="http://192.168.0.100:46151/" target="_blank" style="color:rgb(17,85,204)">192.168.0.100:46151</a> -> <a href="http://192.168.0.100:5672/" target="_blank" style="color:rgb(17,85,204)">192.168.0.100:5672</a>):</div>
<div>connection_closed_abruptly</div><div><br></div><div>=INFO REPORT==== 26-Sep-2012::03:29:37 ===</div><div>accepting AMQP connection <0.8897.18> (<a href="http://192.168.0.100:43836/" target="_blank" style="color:rgb(17,85,204)">192.168.0.100:43836</a> -> <a href="http://192.168.0.100:5672/" target="_blank" style="color:rgb(17,85,204)">192.168.0.100:5672</a>)</div>
</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)"><br></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
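<div>To put a number on the gap between the two log entries above, here is a minimal sketch that parses the report timestamps and prints the seconds between the abrupt close and the next accepted connection. The class name RabbitLogGap and its methods are hypothetical helpers, not part of RabbitMQ or its Java client, and the sketch assumes the dd-MMM-yyyy::HH:mm:ss layout shown in the log with English month names.</div>
<pre>
```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for illustration; not part of RabbitMQ or its Java client.
final class RabbitLogGap {
    // Assumed timestamp layout of the report headers, e.g. 26-Sep-2012::03:29:25.
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("dd-MMM-yyyy'::'HH:mm:ss", Locale.ENGLISH);

    // Parse one report timestamp into a local date-time (the log carries no zone).
    static LocalDateTime parse(String stamp) {
        return LocalDateTime.parse(stamp.trim(), LOG_TIME);
    }

    // Whole seconds from the earlier timestamp to the later one.
    static long secondsBetween(String earlier, String later) {
        return Duration.between(parse(earlier), parse(later)).getSeconds();
    }

    public static void main(String[] args) {
        // The two timestamps are copied from the WARNING and INFO reports above.
        long gap = secondsBetween("26-Sep-2012::03:29:25", "26-Sep-2012::03:29:37");
        System.out.println("close to re-accept gap: " + gap + " s");
    }
}
```
</pre>
<div>For the two entries above this gives 12 seconds, which may help when lining the broker log up against the 10 s confirm timeout on the client side.</div>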
<br></div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">Regards,</div><div style="color:rgb(34,34,34);font-family:arial,sans-serif;font-size:13px;background-color:rgb(255,255,255)">
Wong</div><br><div class="gmail_quote">On Tue, Sep 25, 2012 at 2:32 PM, Matthias Radestock <span dir="ltr"><<a href="mailto:matthias@rabbitmq.com" target="_blank">matthias@rabbitmq.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Wong,<div class="im"><br>
<br>
On 25/09/12 03:46, Wong Kam Hoong wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
com.rabbitmq.client.ShutdownSignalException: clean channel shutdown;<br>
reason: #method<channel.close>(reply-code=406, reply-text=TIMEOUT<br>
WAITING FOR ACK, class-id=0, method-id=0)<br>
at<br>
com.rabbitmq.client.impl.ChannelN.waitForConfirms(ChannelN.java:182)<br>
<br></div>
Based on the log, the problem seems to be related to "*NACKS Received*",<br>
and the "*ChannelN*" code throws the TimeoutException with reply-code *406*<br>
(PRECONDITION_FAILED).<br>
</blockquote>
<br>
I don't think it has anything to do with nacks - unless you are actually seeing some exception mentioning nacks.<br>
<br>
Looks like a straightforward timeout.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
I checked that all RabbitMQ servers are still working fine, and I believe the<br></div>
time I set for *waitForConfirmsOrDie*(10000) should be sufficient.<br>
</blockquote>
<br>
10s should indeed generally be long enough. But one can certainly envision scenarios where it won't be. What's the general health of your rabbit when this happens? Plenty of free memory and disk? No strange errors in the logs? Are you using mirrored/HA queues at all? Could there possibly be a spike, i.e. lots of messages getting published in a short space of time?<br>
<br>
<br>
Regards,<br>
<br>
Matthias.<br>
</blockquote></div><br>