[rabbitmq-discuss] RabbitMQ blocking issue

Matthias Radestock matthias at rabbitmq.com
Thu Feb 14 08:21:18 GMT 2013


James,

On 13/02/13 23:40, Carroll James (Nokia-LC/Malvern) wrote:
> Yup. The times are about right. See the connections from 182. The
> first email show port 49191 which was opened at 15:17 GMT. It was
> hung for quite a while before we got to it but the server closed it
> at 21:22 and opened a new one. That's around the time we were in
> there.
>
> If it matters the log is attached.

Here's the relevant portion:

<quote>
=INFO REPORT==== 13-Feb-2013::15:17:00 ===
accepting AMQP connection <0.2252.1282> (10.196.42.182:49191 -> 10.196.42.21:5672)

=ERROR REPORT==== 13-Feb-2013::21:22:01 ===
closing AMQP connection <0.2252.1282> (10.196.42.182:49191 -> 10.196.42.21:5672):
{heartbeat_timeout,running}
</quote>

So the connection was open for quite a while and then got closed due to 
a missed heartbeat.
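
To put a number on "quite a while", here is a minimal Python sketch that
subtracts the two timestamps from the log excerpt above. The function name
is made up for illustration, and the format string is my reading of the
log headers, not something taken from the RabbitMQ sources.

```python
from datetime import datetime

# Timestamp layout as it appears in the report headers quoted above,
# e.g. "13-Feb-2013::15:17:00". Note that %b depends on the locale;
# this assumes English month abbreviations.
LOG_TIME_FORMAT = "%d-%b-%Y::%H:%M:%S"


def connection_lifetime(opened, closed):
    """Return how long a connection stayed open, given two log timestamps."""
    opened_at = datetime.strptime(opened, LOG_TIME_FORMAT)
    closed_at = datetime.strptime(closed, LOG_TIME_FORMAT)
    return closed_at - opened_at


lifetime = connection_lifetime("13-Feb-2013::15:17:00", "13-Feb-2013::21:22:01")
print(lifetime)  # 6:05:01
```

So the connection lasted a little over six hours.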

The default heartbeat interval is 10 minutes, but the server only times
out a connection between two and three intervals after it last received
data, so it could take up to half an hour for the server to kill a
connection that has gone bad.
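
The arithmetic can be sketched in a few lines of Python. This only encodes
the "two to three intervals" rule as stated above; the function names are
hypothetical, and the close time is the one from the log excerpt.

```python
from datetime import datetime, timedelta


def heartbeat_timeout_window(interval):
    """Earliest and latest delay between the last data and the timeout."""
    return 2 * interval, 3 * interval


def last_data_window(closed_at, interval):
    """Time range in which the last data must have arrived, given the close time."""
    earliest_delay, latest_delay = heartbeat_timeout_window(interval)
    return closed_at - latest_delay, closed_at - earliest_delay


interval = timedelta(minutes=10)  # default interval mentioned above
earliest, latest = heartbeat_timeout_window(interval)
print(earliest, latest)  # 0:20:00 0:30:00

closed_at = datetime(2013, 2, 13, 21, 22, 1)  # close time from the log
oldest, newest = last_data_window(closed_at, interval)
print(oldest.time(), newest.time())  # 20:52:01 21:02:01
```

Taken at face value, the rule means the server last saw data on that
connection somewhere between 20:52 and 21:02.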

The question is why the server isn't seeing any data. There are two 
obvious explanations:
a) there is a network disruption, or
b) the server has stopped reading from the socket.

The presence of a non-empty Recv-Q on the server-side connection points 
to the latter. We need to get hold of a 'rabbitmqctl report' showing 
that connection while the Recv-Q is non-empty, so we know what state the 
server thought the connection was in at the time.
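
To catch the right moment for the report, something like the following
Python sketch could flag connections with unread data. It assumes the
column layout of Linux 'netstat -tn' (Proto, Recv-Q, Send-Q, local address,
peer address, state). The function name is hypothetical, and the sample
text, including the Recv-Q value and the second peer, is made up for
illustration and is not from the actual server.

```python
def backed_up_connections(netstat_output, local_port=5672):
    """Return (recv_q, local, peer) for TCP sockets on local_port with unread data."""
    results = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp  Recv-Q  Send-Q  local  peer  state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skips the header line and anything that is not TCP
        if not fields[1].isdigit():
            continue
        recv_q, local, peer = int(fields[1]), fields[3], fields[4]
        if recv_q > 0 and local.rsplit(":", 1)[-1] == str(local_port):
            results.append((recv_q, local, peer))
    return results


# Illustrative input only; the numbers here are invented.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp     4096      0 10.196.42.21:5672       10.196.42.182:49191     ESTABLISHED
tcp        0      0 10.196.42.21:5672       10.196.42.183:50000     ESTABLISHED
"""

for recv_q, local, peer in backed_up_connections(SAMPLE):
    print(f"{peer} -> {local}: {recv_q} bytes unread")
```

When this prints anything for the broker port, that is the time to run
'rabbitmqctl report'.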

Regards,

Matthias.

