[rabbitmq-discuss] RabbitMQ server closing connection every few minutes
Tim Watson
tim at rabbitmq.com
Tue May 22 08:48:19 BST 2012
On 22/05/2012 07:34, David Tinker wrote:
> Hi
>
> We have a Java application using RabbitMQ that keeps getting its
> connection closed by the server every few minutes. On the client side
> we see this:
>
> 2012-05-22 05:30:05,337 [AMQP Connection 127.0.0.1:5671] ERROR
> pork.RabbitService - Connection to Rabbit Exchange shutting down:
> com.rabbitmq.client.ShutdownSignalException: connection error; reason:
> java.net.SocketException: Connection reset
> Reason: java.net.SocketException: Connection reset
>
> On the server side we get this:
>
> =INFO REPORT==== 22-May-2012::05:25:20 ===
> accepting AMQP connection<0.32660.612> (127.0.0.1:45458 -> 127.0.0.1:5671)
>
> =ERROR REPORT==== 22-May-2012::05:30:05 ===
> closing AMQP connection<0.32660.612> (127.0.0.1:45458 -> 127.0.0.1:5671):
> {timeout,running}
>
Hi David. The second error message will appear in the logs when there
has been an unhandled exception in rabbit's 'reader process'. The
'{timeout, running}' term has two parts: 'timeout' means that receiving
data from the network took too long, and 'running' is the state the
AMQP connection was in at the time. When this condition arises, the
broker's reader process shuts down the connection.
So at first glance, it does look like a networking problem to me,
although TBH I'm not sure about the nature of it just yet. I suspect it
could have something to do with "... It can sometimes be several minutes
before a message is acked or nacked ...".
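One rough way to test that suspicion would be to log how long
deliveries stay outstanding before they are acked or nacked. The helper
below is only a sketch: UnackedTracker and its method names are made up
for illustration and are not part of amqp-client, and it assumes the
application can call it from the consumer callback and from whichever
thread later acks or nacks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not part of amqp-client): remembers when each
 * delivery arrived so the application can log how long messages stay
 * unacknowledged.
 */
final class UnackedTracker {
    // delivery tag -> arrival time in milliseconds
    private final Map<Long, Long> arrivals =
            new ConcurrentHashMap<Long, Long>();

    /** Call from the consumer callback when a message is delivered. */
    void delivered(long deliveryTag, long nowMillis) {
        arrivals.put(deliveryTag, nowMillis);
    }

    /**
     * Call just before acking or nacking. Returns how long the message
     * was outstanding, or -1 if the tag is unknown or already settled.
     */
    long settled(long deliveryTag, long nowMillis) {
        Long arrivedAt = arrivals.remove(deliveryTag);
        return arrivedAt == null ? -1L : nowMillis - arrivedAt;
    }

    /** Number of deliveries not yet acked or nacked. */
    int outstanding() {
        return arrivals.size();
    }

    /** Age of the oldest outstanding delivery, or 0 if there is none. */
    long oldestAgeMillis(long nowMillis) {
        long oldest = nowMillis;
        for (long arrivedAt : arrivals.values()) {
            oldest = Math.min(oldest, arrivedAt);
        }
        return nowMillis - oldest;
    }
}
```

The idea would be to call delivered(tag, System.currentTimeMillis()) on
each delivery and settled(...) right before the ack or nack, then log
outstanding() and oldestAgeMillis(...) around the time the connection
drops. A return value of -1 from settled(...) would also flag a second
ack attempt for the same tag.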
> It happens more frequently when the system is busy (500-1000
> messages/s) than when it is cruising (50 messages/s). The application
> and RabbitMQ are on the same box so I don't think it's a
> networking/firewall problem.
>
> The application only uses one connection and one channel. Messages are
> received using Channel.basicConsume(..) and acked or nacked
> sometime later on different threads. It can sometimes be several
> minutes before a message is acked or nacked. We are using
> Channel.basicQos(10000) to limit the number of outstanding messages
> but it's not often that it reaches 10000 and the error happens at any time.
> There is defensive code to make sure messages are not acked/nacked
> more than once.
>
> Anyone have any ideas? Is there any way to get more information on the
> server or client side regarding the error?
>
Well, the server error seems pretty clear. Just to be 100% sure, I can
confirm that the error message you're seeing is generated in only one
place in the whole code base:
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_reader.erl#L241.
The code handling the timeout condition is
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_reader.erl#L321,
the call site for which is
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_reader.erl#L284.
The origin of this is
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_net.erl#L108
in rabbit_net. What that code does is make the calling process wait for
a message from the Erlang networking layer. So it is the underlying
TCP/IP layer in Erlang which is noticing a timeout on the connection.
How the client's behaviour leads to that timeout is another matter.
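To make the "wait for a message, give up after a timeout" pattern a bit
more concrete, here is a loose Java analogy. It is not a translation of
rabbit_reader.erl or rabbit_net.erl: the class and method names, the use
of a BlockingQueue as a stand-in for the networking layer, and the
timeout being a parameter are all my own assumptions for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Illustrative analogy only; the broker itself is written in Erlang. */
final class ReaderLoopSketch {

    /** Thrown when nothing arrives in time; carries the connection state. */
    static final class ReaderTimeout extends Exception {
        final String connectionState;

        ReaderTimeout(String connectionState) {
            super("{timeout," + connectionState + "}");
            this.connectionState = connectionState;
        }
    }

    /**
     * Waits for the next chunk of data from a hypothetical network inbox.
     * If nothing arrives within timeoutMillis, the reader gives up and
     * the caller is expected to close the connection.
     */
    static byte[] nextFrame(BlockingQueue<byte[]> inbox,
                            long timeoutMillis,
                            String connectionState)
            throws InterruptedException, ReaderTimeout {
        byte[] data = inbox.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (data == null) {
            // Nothing arrived in time: report the timeout together with
            // the state the connection was in.
            throw new ReaderTimeout(connectionState);
        }
        return data;
    }
}
```

In this sketch the timeout is raised purely because the reader saw no
incoming data in time, regardless of what the other side is busy doing,
which is why the client's behaviour is the next thing to look at.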
Hope this helps us get started figuring out what's wrong!
Cheers,
Tim
> RabbitMQ 2.8.1 on Gentoo Linux 64bit
> Erlang R15B (erts-5.9) [source] [64-bit]
> amqp-client-2.8.2 (also tried amqp-client-2.7.1)
> java version "1.6.0_31"
> Java(TM) SE Runtime Environment (build 1.6.0_31-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01, mixed mode)
>
> Thanks
> David