[rabbitmq-discuss] Rabbitmq 2.5-2.6.1 java hanging on close of connection

Benjamin Bennett benbennett at gmail.com
Mon Nov 21 16:10:56 GMT 2011


Our application uses the CachingConnectionFactory with Spring
Integration AMQP.
We trigger this simply by applying @DirtiesContext in our unit tests,
which forces the context to be reloaded. At that point the channel
close and the connection close are issued from different threads.
The BlockingQueueConsumer is the one that ends up initiating the
channel closes, but the CachingConnectionFactory is what actually
performs the closes.
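
One of the workarounds Simon lists in the quoted message below is to
make sure the channel close and the connection close never overlap. A
minimal sketch of that idea, assuming both closes can be routed through
one shared guard object. CloseGuard, runExclusively and demoMaxOverlap
are made-up names for illustration, not spring-amqp or RabbitMQ client
API, and the "closes" here are stand-ins that only sleep:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of spring-amqp or the RabbitMQ Java client.
final class CloseGuard {
    private final Object lock = new Object();

    /** Runs one close action while holding the lock shared by all closes. */
    void runExclusively(Runnable closeAction) {
        synchronized (lock) {
            closeAction.run();
        }
    }

    /** Runs two fake closes on two threads; returns the most seen running at once. */
    static int demoMaxOverlap() {
        CloseGuard guard = new CloseGuard();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        // Stand-in for channel.close() / connection.close(): takes 50 ms to "complete".
        Runnable fakeClose = () -> {
            maxActive.accumulateAndGet(active.incrementAndGet(), Math::max);
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            active.decrementAndGet();
        };
        Thread consumer = new Thread(() -> guard.runExclusively(fakeClose), "consumer");
        Thread factory = new Thread(() -> guard.runExclusively(fakeClose), "factory");
        consumer.start();
        factory.start();
        try {
            consumer.join();
            factory.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return maxActive.get();
    }

    public static void main(String[] args) {
        System.out.println("max overlapping closes: " + demoMaxOverlap());
    }
}
```

With the guard in place the demo should never see more than one close
running at a time. In a real application the Runnable would wrap the
actual close call and its IOException, and this only helps if every
close in the process goes through the same guard.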


The stack traces show no threads running other than the spring-amqp
threads for BlockingQueueConsumer and CachingConnectionFactory.

My only comment about the bug not attracting attention: it is a race
condition, and race conditions are the most painful bugs to deal with.

On my dev box I have seen the issue only once in 200 runs that ran
overnight.
On our Windows 7 test VMs it happens about 50% of the time (sample
size: 40 runs).
On our Windows XP test VMs it has never happened.

In production we get a "The service failed to close in timely
manner" and the service doesn't shut down.
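
For illustration, here is a rough sketch of the "timeout on the
connection close" hack discussed in the quoted messages below: the
blocking close runs on a helper thread and the caller waits only a
bounded time for it. BoundedClose and closeWithin are made-up names,
the close action is passed in as a plain Runnable, and nothing here is
taken from the Java client or spring-amqp:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of spring-amqp or the RabbitMQ Java client.
final class BoundedClose {

    /** Returns true if closeAction returned within timeoutMillis, false otherwise. */
    static boolean closeWithin(Runnable closeAction, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true); // a stuck close must not keep the JVM alive
            return t;
        });
        Future<?> result = executor.submit(closeAction);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // best effort: interrupt the stuck close
            return false;
        } catch (ExecutionException e) {
            return true; // the close failed, but it returned, so it did not hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean quick = closeWithin(() -> { }, 1000);
        boolean stuck = closeWithin(() -> {
            try {
                Thread.sleep(5000); // stand-in for a close that never completes
            } catch (InterruptedException e) {
                // cancelled by closeWithin
            }
        }, 100);
        System.out.println("quick close finished: " + quick);
        System.out.println("stuck close finished: " + stuck);
    }
}
```

This only bounds how long the caller waits. Whether a genuinely stuck
close reacts to the interrupt is an open question, which is why the
helper thread is marked as a daemon so it cannot block JVM exit.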

Finally, if you are fixing it in the server, what about the previous
versions? I don't even think spring-amqp has migrated to the 2.7 series
Java client, because it is backwards incompatible.

Thanks,
Benjamin Bennett

On Mon, Nov 21, 2011 at 7:07 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
> Hi Benjamin.
>
> In general we do accept patches, although due to our Corporate Overlords you
> would have to sign a contributor agreement.
>
> I would be reluctant to merge such a patch though, since:
>
> * It's to work around a server bug which will be fixed in the next release
>
> * This bug has been around for most of a year without attracting much
> attention
>
> * Option 1) is ugly; option 2) is (somewhat) complicated.
>
> BTW, I got a reply from the spring-amqp maintainer, Dave Syer:
>
>> Spring AMQP doesn't explicitly invoke close() in different threads.
>> There's nothing to stop it happening (as is the case with the Java
>> client itself I suppose), but we actually hardly ever call
>> Channel.close() so it is pretty unlikely.  I would be interested to
>> hear of a way to tickle a normal app into this behaviour.
>
> So I still wonder if the threading thing is something you are doing in your
> app.
>
> Cheers, Simon
>
> On 17/11/11 18:12, Benjamin Bennett wrote:
>>
>> I had a question: if no one is working on a patch for the Java client,
>> do you accept external patches?
>>
>> I was going to change it in one of the following ways.
>>
>> 1) Remove the infinite wait on a connection close.
>>
>> Or
>>
>> 2) Place a BlockingQueue on the channel closes during the close call,
>> then synchronize on the BlockingQueue for the connection close, so
>> that it cannot close the connection while a channel is currently
>> being closed.
>>
>> The second is more code, but it would keep the infinite timeout in
>> place while working around the deadlock issue.
>>
>> It would save a lot of pain for people using 2.7.1 and below.
>>
>> On Mon, Nov 14, 2011 at 5:21 PM, Benjamin Bennett<benbennett at gmail.com>
>>  wrote:
>>>
>>> I am using the spring amqp lib and it is doing the connection closing
>>> when the spring context is closed. I do not think it has a property to
>>> inject the hack. Also, if you know any of the spring amqp devs: having
>>> you tell them to check that it is done the way you have described
>>> will carry much more authority than coming from me.
>>>
>>> I will probably hack the spring amqp lib for now.
>>>
>>> On Nov 14, 2011 12:32 PM, "Simon MacMullen"<simon at rabbitmq.com>  wrote:
>>>>
>>>> On 14/11/11 16:15, Benjamin Bennett wrote:
>>>>>
>>>>> Here is the output of rabbitmqctl report:
>>>>> http://pastebin.com/MSwv82C3
>>>>
>>>> Ah, thank you. After some poking, that genuinely looks like a server
>>>> bug.
>>>> Damn.
>>>>
>>>> In order for it to happen you need the last channel close / close_ok to
>>>> overlap with the connection close / close_ok. With the Java client you
>>>> have to invoke Channel.close() and Connection.close() from different
>>>> threads to get this to happen, and still be unlucky.
>>>>
>>>> You should be allowed to do this, but right now it's racy.
>>>>
>>>>> I was going to attempt to put a timeout on the connection close method
>>>>> but that really would be a hack.
>>>>
>>>> Indeed! Other slightly less hacky workarounds until we get this fixed:
>>>>
>>>> * Invoke Channel.close() and Connection.close() from the same thread, or
>>>> otherwise ensure they don't overlap.
>>>>
>>>> * Don't invoke Channel.close() if you know you're going to invoke
>>>> Connection.close() anyway.
>>>>
>>>> Cheers, Simon
>>>>
>>>> --
>>>> Simon MacMullen
>>>> RabbitMQ, VMware
>>>
>
>
> --
> Simon MacMullen
> RabbitMQ, VMware
>

