[rabbitmq-discuss] RabbitMQ server closing connection every few minutes

Tim Watson tim at rabbitmq.com
Tue May 22 08:48:19 BST 2012


On 22/05/2012 07:34, David Tinker wrote:
> Hi
>
> We have a Java application using RabbitMQ that keeps getting its
> connection closed by the server every few minutes. On the client side
> we see this:
>
> 2012-05-22 05:30:05,337 [AMQP Connection 127.0.0.1:5671] ERROR
> pork.RabbitService  - Connection to Rabbit Exchange shutting down:
> com.rabbitmq.client.ShutdownSignalException: connection error; reason:
> java.net.SocketException: Connection reset
> Reason: java.net.SocketException: Connection reset
>
> On the server side we get this:
>
> =INFO REPORT==== 22-May-2012::05:25:20 ===
> accepting AMQP connection<0.32660.612>  (127.0.0.1:45458 ->  127.0.0.1:5671)
>
> =ERROR REPORT==== 22-May-2012::05:30:05 ===
> closing AMQP connection<0.32660.612>  (127.0.0.1:45458 ->  127.0.0.1:5671):
> {timeout,running}
>

Hi David. The second error message will appear in the logs when there 
has been an un-handled exception in rabbit's 'reader process'. The 
'{timeout, running}' state occurs when receiving data from the network 
takes too long (which is where the 'timeout' tag originates) but the 
AMQP connection state is 'running' - the second tag. When this condition 
arises, the broker's reader process shuts down the connection.

So at first glance, it does look like a networking problem to me, 
although TBH I'm not sure about the nature of it just yet. I suspect it 
could have something to do with "... It can sometimes be several minutes 
before a message is acked or nacked ...".

> It happens more frequently when the system is busy (500-1000
> messages/s) than when it is cruising (50 messages/s). The application
> and RabbitMQ are on the same box so I don't think its a
> networking/firewall problem.
>
> The application only uses one connection and one channel. Messages are
> received using Connection.basicConsume(..) and acked or nacked
> sometime later on different threads. It can sometimes be several
> minutes before a message is acked or nacked. We are using
> Connection.basicQos(10000) to limit the number of outstanding messages
> but its not often that reaches 10000 and the error happens anytime.
> There is defensive code to make sure messages are not acked/nacked
> more than once.
>
> Anyone have any ideas? Is there any way to get more information on the
> server or client side regarding the error?
>

Well the server error seems pretty clear and just to be 100% sure, I can 
verify that the error message you're seeing is only generated in one 
place in the whole code base: 
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_reader.erl#L241.

The code handling the timeout condition is 
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_reader.erl#L321, 
the call site for which is 
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_reader.erl#L284. 
The origin of this is 
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_net.erl#L108 
in rabbit_net - what the code does is make the calling process wait for 
a message from the Erlang networking layer. So it is the underlying 
TCP/IP layer in Erlang which is noticing a timeout on the connection.

How this is mediated through the client's behaviour is another matter.

Hope this helps us get started figuring out what's wrong!

Cheers,
Tim

> RabbitMQ 2.8.1 on Gentoo Linux 64bit
> Erlang R15B (erts-5.9) [source] [64-bit]
> amqp-client-2.8.2 (also tried amqp-client-2.7.1)
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
>
> Thanks
> David
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss



More information about the rabbitmq-discuss mailing list