[rabbitmq-discuss] Queue delete causes transaction errors

Tue Jul 13 16:29:11 BST 2010

Hi Aaron,

> I've confirmed that transactions are being aborted due to a queue
> being deleted using an easily-reproducible test case.
> 
> The base-case is an exchange and a 1-1 mapping of a routing key to a
> destination queue.  The client is simply consuming a published message
> and then publishing the next one.  A single client as a consumer is
> enough to repeat the bug.
> 
> We then have a test client which we can use to attach to an exchange
> using a supplied routing key.  It will create its own queue and then
> act as a passive listener, for easy monitoring of traffic.  The queue
> it uses is set to auto-delete. In high-traffic situations, I would
> occasionally see a transaction error in the client.
> 
> I setup a test case today where a listener would open a connection and
> queue, listen for 2 seconds, then disconnect.  I tried combinations of
> auto_delete enabled and disabled, both with and without an explicit
> queue delete call, as well as using transactions and re-using a
> connection versus closing and reconnecting.  I would run this test
> listener while a test client simply published a message to itself
> every time it received one.  The client is using transactions,
> committing after each publish call.  Within a few minutes, no matter
> how my listener was configured, the client would receive a transaction
> error.  I repeated this with the 1.8.0 release.

It looks like there are two races going on here:
  - the queue being autodeleted, and the transaction committing; and
  - the connection dropping, and the transaction committing.

In the first case, the transaction commit fails because the queue has 
gone away and it can no longer route the message to it.  I'm less 
certain about the second; I think it may be because the queue tries to 
deliver the message on tx.commit, and the connection drops while that's 
happening.

The AMQP spec doesn't say a lot about the properties of transactions, 
and in particular, whether routing "happens" before or after tx.commit. 
RabbitMQ routes before the tx.commit, mainly so that persistent 
messages will land on disk.

It would be well within the spec to *act* as though routing happened 
after tx.commit; e.g., the transaction wouldn't fail because your 
autodelete queue has gone away.  We'd also have to take care, in the 
second case, that failing to deliver the message doesn't cause the 
tx.commit to fail.  That's probably a more useful semantics overall, anyway.
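
To make the difference between the two semantics concrete, here is a 
toy sketch in Python.  It is not RabbitMQ code and doesn't talk to a 
broker; ToyBroker, QueueGone and run are hypothetical names invented 
for illustration, and the model assumes a single open transaction.  It 
only shows why a queue deleted between publish and commit fails the 
commit when targets are resolved at publish time, and doesn't when 
they are resolved at commit time.

```python
class QueueGone(Exception):
    """Raised when a routed message targets a queue that no longer exists."""


class ToyBroker:
    """Hypothetical in-memory model of the two semantics; not RabbitMQ code."""

    def __init__(self, route_at_commit):
        self.route_at_commit = route_at_commit
        self.queues = {}     # queue name -> list of delivered message bodies
        self.bindings = {}   # routing key -> list of bound queue names
        self.pending = []    # uncommitted publishes of the one open transaction

    def declare(self, queue, routing_key):
        self.queues[queue] = []
        self.bindings.setdefault(routing_key, []).append(queue)

    def delete(self, queue):
        # Models auto-delete: the queue and its bindings vanish together.
        self.queues.pop(queue, None)
        for targets in self.bindings.values():
            if queue in targets:
                targets.remove(queue)

    def publish(self, routing_key, body):
        if self.route_at_commit:
            targets = None  # defer routing until commit
        else:
            # Route now: remember the queues bound at publish time.
            targets = list(self.bindings.get(routing_key, []))
        self.pending.append((routing_key, targets, body))

    def commit(self):
        pending, self.pending = self.pending, []
        delivered = 0
        for routing_key, targets, body in pending:
            if targets is None:
                # Late routing only ever sees queues that still exist.
                targets = list(self.bindings.get(routing_key, []))
            missing = [q for q in targets if q not in self.queues]
            if missing:
                raise QueueGone(missing[0])
            for queue in targets:
                self.queues[queue].append(body)
                delivered += 1
        return delivered


def run(route_at_commit):
    broker = ToyBroker(route_at_commit)
    broker.declare("client", "traffic")    # the publishing client's own queue
    broker.declare("listener", "traffic")  # the passive test listener's queue
    broker.publish("traffic", "msg")
    broker.delete("listener")              # listener goes away before commit
    try:
        return broker.commit()
    except QueueGone:
        return "commit failed"


early = run(route_at_commit=False)
late = run(route_at_commit=True)
```

With routing at publish time the commit fails, because the listener's 
queue was chosen as a target and has since disappeared; with routing 
deferred to commit, the message is simply delivered to the one queue 
that is still bound.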

(We actually already have a bug for looking into this -- thanks for 
bringing it back to our attention!)

> I see that 0.9.1 of the spec adds queue.unbind().  Is that the only
> way to avoid this problem, or is there another approach that we can
> take?

I don't see how queue.unbind would help -- would you explain?

Cheers,
Michael