[rabbitmq-discuss] auto-delete queue/exchanges and cluster synchronization

Tue May 13 14:25:27 BST 2014

Hi all,

I would like to start by apologizing for the long email, but I have quite a
few questions that I need to clarify :)

So, about my problem: I would like to report a misbehavior of the RabbitMQ
cluster (RabbitMQ 3.1.5, Erlang R14B04) that we have noticed in our
OpenStack cluster, which uses RabbitMQ for message delivery. Understanding
OpenStack is not required here; I will try to abstract it away as much as
I can.

Basically, we have a 3-node RabbitMQ cluster and a bunch of OpenStack
services that use RabbitMQ to communicate with each other. One of the means
of communication is RPC, which is done by creating exchanges and queues with
the auto-delete flag set; these are the ones that exhibit the problem.

Usually all OpenStack services start out using one node of the cluster, and
when this node goes down they connect to another node and make sure that all
exchanges and queues are re-created there. Usually the re-creation ends up
being a no-op because of cluster synchronization (queues are also created
with x-ha-policy set to all).

But at the same time, if a node of the cluster goes down, the queue
consumers that were created on this node will be deleted, so the queues with
auto-delete will end up being deleted too, and likewise the exchanges bound
to them, which also have auto-delete set. All of this will be done
**eventually** on all RabbitMQ nodes that are still up.

So in detail, from the Neutron side we have:

N1. Connect to node 2.
N2. Create Exchange X.
N3. Create Queue Q.
N4. Create Binding from Q to X.

From the cluster side we have:

R1. Delete consumer.
R2. Delete Queue Q (Binding is deleted explicitly).
R3. Delete Exchange X.

This can actually lead to a race condition that results in N4 failing with
an error stating that the exchange doesn't exist, apparently because R3 was
executed after N2 and before N4.
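
The interleaving described above can be reproduced with a small in-memory
model. This is only a sketch: FakeBroker, NotFound and run_interleaving are
hypothetical names made up for illustration, not RabbitMQ or OpenStack APIs,
and the model assumes that X and Q still exist from before the failover.

```python
class NotFound(Exception):
    """Raised when a bind refers to an exchange or queue that is gone."""


class FakeBroker:
    """Hypothetical in-memory stand-in for the cluster state."""

    def __init__(self):
        self.exchanges = set()
        self.queues = set()
        self.bindings = set()

    def declare_exchange(self, name):
        self.exchanges.add(name)  # no-op if the exchange already exists

    def declare_queue(self, name):
        self.queues.add(name)  # no-op if the queue already exists

    def bind(self, queue, exchange):
        if exchange not in self.exchanges:
            raise NotFound("no exchange '%s'" % exchange)
        if queue not in self.queues:
            raise NotFound("no queue '%s'" % queue)
        self.bindings.add((queue, exchange))

    def delete_queue(self, name):
        self.queues.discard(name)
        # Bindings of the queue go away together with the queue.
        self.bindings = {b for b in self.bindings if b[0] != name}

    def delete_exchange(self, name):
        self.exchanges.discard(name)


def run_interleaving():
    """Replay R2, N2, N3, R3, N4 and return the error message, if any."""
    broker = FakeBroker()
    # State left over from before the node went down.
    broker.declare_exchange("X")
    broker.declare_queue("Q")
    broker.bind("Q", "X")

    broker.delete_queue("Q")       # R2: consumer is gone, Q is auto-deleted
    broker.declare_exchange("X")   # N2: X still exists, so this is a no-op
    broker.declare_queue("Q")      # N3: Q is re-created
    broker.delete_exchange("X")    # R3: the pending auto-delete of X lands
    try:
        broker.bind("Q", "X")      # N4: fails, X no longer exists
    except NotFound as exc:
        return str(exc)
    return None
```

Calling run_interleaving() returns the "no exchange" message, which matches
the failure seen at N4.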

A workaround that we have created is to retry the creation if it fails, as
you can see in the bug report on the OpenStack side:
https://bugs.launchpad.net/neutron/+bug/1318721.
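
The idea of the retry can be sketched as follows. This is not the actual
patch from the bug report; setup_with_retry, NotFound and demo are
hypothetical names, and the three callables stand in for whatever the client
library does for N2, N3 and N4.

```python
import time


class NotFound(Exception):
    """Stand-in for the 'exchange does not exist' error seen at N4."""


def setup_with_retry(declare_exchange, declare_queue, bind,
                     attempts=3, delay=0.1):
    """Run N2..N4; if N4 hits NotFound, start again from N2.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        try:
            declare_exchange()  # N2
            declare_queue()     # N3
            bind()              # N4
            return attempt
        except NotFound:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay)


def demo():
    """Simulate one stale auto-delete (R3) arriving between N2 and N4."""
    exchanges = set()
    stale_delete_pending = [True]

    def declare_exchange():
        exchanges.add("X")

    def declare_queue():
        pass  # the queue is not relevant for this demo

    def bind():
        if stale_delete_pending[0]:
            stale_delete_pending[0] = False
            exchanges.discard("X")  # R3 lands just before the bind
        if "X" not in exchanges:
            raise NotFound("no exchange 'X'")

    return setup_with_retry(declare_exchange, declare_queue, bind, delay=0)
```

In this demo the first attempt fails at the bind and the second one
succeeds, because the stale delete has already been consumed by then.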

But I think that this is also a problem on the RabbitMQ side. Basically, I
believe a saner behavior would be for RabbitMQ to ignore an **old** delete
if a newer declare for the exchange has been sent, instead of treating the
latter as a no-op.
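
To make the suggestion concrete, here is one way it could be modelled, with
a generation number per declare. This is purely illustrative and is not how
RabbitMQ is implemented; ExchangeTable and its methods are hypothetical.

```python
import itertools


class ExchangeTable:
    """Hypothetical model: every declare bumps a generation number, and a
    delete only takes effect if no newer declare has been seen since the
    delete was decided."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._generation = {}  # exchange name -> generation of latest declare

    def declare(self, name):
        # A re-declare is not a plain no-op: it refreshes the generation.
        gen = next(self._counter)
        self._generation[name] = gen
        return gen

    def schedule_auto_delete(self, name):
        """Record which generation the delete decision was based on."""
        return (name, self._generation.get(name))

    def apply_delete(self, ticket):
        """Apply a previously scheduled delete; return True if it ran."""
        name, gen = ticket
        if gen is None or self._generation.get(name) != gen:
            return False  # stale: a newer declare arrived, or nothing to delete
        del self._generation[name]
        return True

    def exists(self, name):
        return name in self._generation
```

With this model, a delete scheduled before the client's re-declare (N2) is
ignored when it finally arrives, so the later bind (N4) still finds the
exchange; a delete with no intervening declare goes through as usual.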

Does my analysis above make sense?

Also, when reading the AMQP 0-9-1 reference (
https://www.rabbitmq.com/amqp-0-9-1-reference.html), I found that:

   - The server SHOULD allow for a reasonable delay between the point when
   it determines that an exchange is not being used (or no longer used), and
   the point when it deletes the exchange. At the least it must allow a client
   to create an exchange and then bind a queue to it, with a small but
   non-zero delay between these two actions.

Does my finding contradict this? And how big is the delay?

One last thing is the auto-delete deprecation from
http://www.rabbitmq.com/amqp-0-9-1-errata.html, point 25:

   The 'auto-delete' flag on 'exchange.declare' got deprecated in 0-9-1.
   Auto-delete exchanges are actually quite useful, so this flag should be
   restored.

Does this mean that the auto-delete flag will not be removed from RabbitMQ?

Thank you for your time; I hope this was helpful.

--
Mouad Benchchaoui