[rabbitmq-discuss] Bug: Java ConnectionFactory can hang forever in network packet loss cases

Ильдар Нурисламов absorbb at gmail.com
Mon Feb 13 09:28:39 GMT 2012


Can anybody help with this problem, or show me where I'm wrong?

2012/2/7 Ильдар Нурисламов <absorbb at gmail.com>

> Hello.
>
> We have RabbitMQ 2.7.1 Java clients connected remotely to the server.
> We started to experience short periods of bad network conditions, and a
> serious problem occurred. Our settings:
> 1. factory.setRequestedHeartbeat is set to 30 s
> 2. factory.setConnectionTimeout is set to 30000 ms
> The client properly closes the connection after missing 30 seconds of
> heartbeats, but sometimes it hangs completely when it tries to open a new
> connection.
>
> I tried to analyze the Java client code, and this is what I found:
>
> AMQConnection.java:286:
>     _frameHandler.setTimeout(HANDSHAKE_TIMEOUT);
> (socket.soTimeout is set to 10 s here)
> It then starts the MainLoop at line 294 and blocks at line 300 until it
> gets a reply to the handshake:
>     connStart =
>         (AMQP.Connection.Start) connStartBlocker.getReply().getMethod();
>
> The problem is that it may never get a reply, because MainLoop relies on
> the heartbeat mechanism to handle this situation, and heartbeats are not
> enabled yet. They are enabled only at line 368:
>     setHeartbeat(heartbeat);
> Meanwhile MainLoop runs endlessly at line 492:
>     Frame frame = _frameHandler.readFrame();
> which returns null every 10 s (this is how SocketTimeoutException is
> handled in Frame.readFrom..), and handleSocketTimeout() does nothing
> because _heartbeat is not set yet.
>
> Thanks.
>
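
To make the scenario easier to reproduce, here is a minimal sketch that uses
only plain java.net sockets. It is not the RabbitMQ client code: the class
HandshakeDeadlineSketch and its methods readOrTimeout, awaitReply and demo are
names I made up for illustration. It models a peer that accepts the TCP
connection but never sends anything, and a read loop that sees a socket
timeout on every pass. The overall deadline check is one possible guard,
assumed here and not taken from the client; without it the loop would never
exit.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical stand-in for the handshake wait; not part of the RabbitMQ client.
public class HandshakeDeadlineSketch {
    static final int TIMED_OUT = -2;

    // Reads one byte; a socket read timeout is turned into a marker value,
    // similar in spirit to readFrame() returning null on timeout.
    static int readOrTimeout(InputStream in) throws IOException {
        try {
            return in.read();
        } catch (SocketTimeoutException e) {
            return TIMED_OUT;
        }
    }

    // Returns true if any reply byte arrives, false if the overall deadline
    // passes first. Each individual read is bounded by soTimeoutMs.
    static boolean awaitReply(Socket socket, int soTimeoutMs, long deadlineMs)
            throws IOException {
        socket.setSoTimeout(soTimeoutMs);
        InputStream in = socket.getInputStream();
        long start = System.nanoTime();
        while (true) {
            int b = readOrTimeout(in);
            if (b >= 0) {
                return true;                        // the peer answered
            }
            if (b == -1) {
                throw new IOException("peer closed the connection");
            }
            // Read timed out. Without the check below, a silent peer keeps
            // this loop running forever.
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMs >= deadlineMs) {
                return false;                       // give up on the handshake
            }
        }
    }

    // Connects to a local listener that never accepts or writes, so the
    // TCP connection is established but no reply ever comes back.
    static boolean demo(int soTimeoutMs, long deadlineMs) {
        try (ServerSocket silent =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            return awaitReply(client, soTimeoutMs, deadlineMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(100, 350) ? "got a reply" : "gave up after the deadline");
    }
}
```

With a 100 ms read timeout and a 350 ms deadline, the loop should give up after
a few timed-out reads. If the deadline check is removed, it spins in the same
way I described for MainLoop before heartbeats are enabled.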