[rabbitmq-discuss] Bug: Java ConnectionFactory can hang forever in network packet loss cases

Steve Powell steve at rabbitmq.com
Tue Feb 21 10:50:35 GMT 2012


Ildar,

Just a note to confirm that this bug is now fixed and will be in the next release.

Steve Powell  (a funny bunny)
----------some more definitions from the SPD----------
vermin (v.) Treating the dachshund for roundworm.
chinchilla (n.) Cooling device for the lower jaw.
socialcast (n.) Someone to whom everyone is speaking but nobody likes.

On 16 Feb 2012, at 13:27, Steve Powell wrote:

> Dear Ildar,
> 
> First, may I apologise for not getting back to you sooner. It seems that you
> have clearly identified a bug, and have helped to narrow it down for us.
> 
> Thank you very much. I have raised a problem for us to fix and track this
> (24747).
> 
> I have a few comments regarding your settings: it seems to me that a heartbeat
> of 30s is not unreasonable, but you should be aware that anything up to a minute
> may pass before noticing that a heartbeat is missed, so you must not rely on
> this interval.
> 
> The ConnectionTimeout will only affect waiting for the socket connection so is
> not involved in this. I think your interval here is again quite large, but not
> unreasonable in unreliable networks. I would expect the heartbeat to be about
> half of this (see note above).
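
A minimal sketch of the distinction drawn above, using only plain java.net sockets rather than the RabbitMQ client: the connect timeout bounds the TCP connection attempt, while reads are governed separately by SO_TIMEOUT. The class and method names are hypothetical, and the local "silent server" is an assumption made purely for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ConnectVsReadTimeout {
    /**
     * Connects to a local server that accepts the TCP connection but never
     * sends anything, then reports whether the following read timed out.
     */
    static boolean readTimedOut(int connectTimeoutMs, int readTimeoutMs) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // The connect timeout applies only to establishing the connection.
            client.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()),
                    connectTimeoutMs);
            // Reads are bounded by a separate setting, SO_TIMEOUT.
            client.setSoTimeout(readTimeoutMs);
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived
            } catch (SocketTimeoutException e) {
                return true;  // no data within readTimeoutMs
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Generous connect timeout, short read timeout.
        System.out.println(readTimedOut(30000, 200));
    }
}
```

However large the connect timeout is, it has no influence on how long a later read may block, which is why it plays no part in the hang reported below.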
> 
> We'll get on to this bug asap.
> Steve Powell
> steve at rabbitmq.com
> [wrk: +44-2380-111-528] [mob: +44-7815-838-558]
> 
> On 13 Feb 2012, at 09:28, Ильдар Нурисламов wrote:
> 
>> Can anybody help with this problem or prove that I'm wrong?
>> 
>> 2012/2/7 Ильдар Нурисламов <absorbb at gmail.com>
>> Hello.
>> 
>> We have RabbitMQ 2.7.1 Java clients remotely connected to the server.
>> We started experiencing short-term bad network conditions, and a serious problem occurred:
>> 1. factory.setRequestedHeartbeat is set to 30s
>> 2. factory.setConnectionTimeout is set to 30000ms
>> The client properly closes the connection after missing 30 seconds of heartbeats.
>> But sometimes it hangs completely when it tries to open a new connection.
>> 
>> I tried to analyze the Java client code, and this is the result:
>> 
>> AMQConnection.java:286 :
>>          _frameHandler.setTimeout(HANDSHAKE_TIMEOUT); - socket.soTimeout is set to 10s here
>> then it starts the MainLoop at line 294
>> and blocks until it gets a reply to the handshake at line 300:
>>     connStart =
>>                (AMQP.Connection.Start) connStartBlocker.getReply().getMethod();
>> 
>> The problem is that it may never get a reply, because MainLoop relies on the heartbeat mechanism to handle such a situation, and heartbeats are not enabled yet. That happens only at line 368:
>>            setHeartbeat(heartbeat);
>> MainLoop runs endlessly at line 492:
>>    Frame frame = _frameHandler.readFrame();
>> which returns null every 10s (this is how SocketTimeoutException is handled in Frame.readFrom..)
>> and handleSocketTimeout() does nothing because _heartbeat is not set yet.
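
The behaviour described in this analysis can be modelled in a few lines of plain Java. This is a simplified, hypothetical model, not the client's actual code: the class, field, and method names are invented for illustration, and the "guard" variant is only one conceivable remedy, not necessarily how the bug was fixed.

```java
class HandshakeLoopSketch {
    /** Hypothetical stand-in for the read loop's state during the handshake. */
    static final class LoopModel {
        int heartbeatSeconds = 0;      // stays 0 until negotiation has completed
        boolean inNegotiation = true;  // hypothetical flag, not taken from the report

        /** As reported: a read timeout is ignored while no heartbeat is set. */
        boolean timeoutIsFatalAsReported() {
            return heartbeatSeconds != 0;
        }

        /** Hypothetical remedy: any read timeout during negotiation is fatal. */
        boolean timeoutIsFatalWithGuard() {
            return inNegotiation || timeoutIsFatalAsReported();
        }
    }

    /**
     * Simulates a server that never replies, so every read times out.
     * Returns how many timeouts the loop absorbs before giving up,
     * capped at maxTimeouts so the simulation itself terminates.
     */
    static int timeoutsSurvived(boolean useGuard, int maxTimeouts) {
        LoopModel model = new LoopModel();
        for (int i = 0; i < maxTimeouts; i++) {
            boolean fatal = useGuard
                    ? model.timeoutIsFatalWithGuard()
                    : model.timeoutIsFatalAsReported();
            if (fatal) {
                return i;
            }
        }
        return maxTimeouts; // the loop never gave up within the cap
    }

    public static void main(String[] args) {
        System.out.println(timeoutsSurvived(false, 1000)); // prints 1000
        System.out.println(timeoutsSurvived(true, 1000));  // prints 0
    }
}
```

Without the guard the loop absorbs every timeout up to the cap, which mirrors the indefinite hang; with it, the first timeout during negotiation ends the attempt so the caller can retry.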
>> 
>> Thanks.
>> 
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
> 


