[rabbitmq-discuss] auto-delete queue/exchanges and cluster synchronization

Fri May 16 10:42:57 BST 2014

On Tue, May 13, 2014 at 3:59 PM, Simon MacMullen <simon at rabbitmq.com> wrote:

> On 13/05/2014 14:25, mouad ben wrote:
>
>> RabbitMQ cluster (RabbitMQ 3.1.5,Erlang R14B04)
>>
>
>> (Queues are also created with x-ha-policy set to all).
>>
>
> As an aside, note that x-ha-policy only controls queue mirroring in 2.x!
> In 3.x it does nothing at all. See http://www.rabbitmq.com/ha.html and
> http://www.rabbitmq.com/blog/2012/11/19/breaking-things-with-rabbitmq-3-0/

Oh, thanks for the info; to my shame, I didn't know that.

>
>
> Of course, autodelete RPC queues quite possibly do not need to be HA.

Well, apparently my "chaos monkey" wasn't able to reproduce this problem
when queue mirroring was enabled. I am actually still not sure why that is
the case; I am probably missing something obvious!

>
>
>> But at the same time, if a node of the cluster goes down, the queue
>> consumers that were created on this node will be deleted, the queues with
>> auto-delete will end up being deleted too, and the same goes for the
>> exchanges bound to them which also have auto-delete. All of this will
>> be done **eventually** on all RabbitMQ nodes that are still up.
>>
>> So in detail from neutron side we have:
>>
>> N1. Connect to node 2.
>> N2. Create Exchange X.
>> N3. Create Queue Q.
>> N4. Create Binding from Q to X.
>>
>>  From cluster side we have:
>>
>> R1. Delete consumer.
>> R2. Delete Queue Q (Binding is deleted explicitly).
>> R3. Delete Exchange X.
>>
>> This can actually lead to a race condition that results in N4
>> failing with an error stating that the exchange doesn't exist, because
>> apparently the R3 action was executed after N2 and before N4.
>>
>
> That makes sense. But note that R2 and R3 are an atomic event.

Just to be sure, you don't mean that the (R2+R3) group is one atomic event, do you?

>
>
>> A workaround that we have created is to retry the creation if it fails,
>> as you can see in the bug report on the OpenStack side:
>> https://bugs.launchpad.net/neutron/+bug/1318721.
>>
>
> And that would work as a workaround.
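For illustration only, here is a minimal sketch of what such a retry could look like. This is not the actual neutron patch; the names `declare_with_retry`, `setup`, `attempts` and `delay` are hypothetical. The `setup` callable is assumed to redo the whole N2 -> N3 -> N4 sequence (ideally on a fresh channel, since a failed bind is a channel-level error in AMQP 0-9-1 and closes the channel).

```python
import time


def declare_with_retry(setup, retryable, attempts=3, delay=0.1):
    """Run setup() and retry it when it raises a retryable error.

    setup     -- hypothetical callable that declares exchange, queue and binding
    retryable -- exception class (or tuple of classes) worth retrying on
    attempts  -- total number of tries before giving up
    delay     -- seconds to wait between tries
    """
    for attempt in range(1, attempts + 1):
        try:
            return setup()
        except retryable:
            if attempt == attempts:
                # Out of tries: let the caller see the original error.
                raise
            time.sleep(delay)
```

The whole sequence is retried rather than only the bind, because the exchange declared in N2 may have disappeared by the time N4 runs.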
>
>
>> But I think that this is also a problem on the RabbitMQ side. Basically, I
>> believe a saner behavior would be for RabbitMQ to ignore an **old**
>> delete if a newer exchange declare was sent, instead of treating
>> the latter as a no-op.
>>
>
> Hmm. Newer than what? The queue delete R2 is not necessarily older than
> the most recent exchange declare N2 in this scenario, is it? In fact,
> isn't R2 always newer than N2?
>

Not necessarily. I don't see why there should be any ordering between
unrelated operations from the client side versus the cluster side. When the
connection is broken, this triggers the sequence of events on both sides, but
which one is faster? In the "happy path" the order will be R1 -> R2 -> R3 ->
N1 -> N2 -> N3 -> N4, but in reality all orders are possible, e.g. R1 -> N1 ->
N2 -> R2 -> N3 -> R3 -> N4, right? BTW, when I say order, I am taking the
RabbitMQ cluster as the reference, because operations are atomic, right?
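To make the interleavings concrete, here is a toy in-memory model (not RabbitMQ code; `ToyBroker`, `run` and the step names are invented for this sketch). It assumes, as Simon says, that R2 and R3 happen as one atomic step, leaves out R1, and starts with the old X, Q and binding still present. The atomic cleanup is then slotted in at each possible position among N2, N3 and N4.

```python
class ToyBroker:
    """Toy stand-in for the cluster state: names only, no messages."""

    def __init__(self):
        # State left over from before the node went down.
        self.exchanges = {"X"}
        self.queues = {"Q"}
        self.bindings = {("Q", "X")}

    def declare_exchange(self, name):
        self.exchanges.add(name)  # no-op if it already exists

    def declare_queue(self, name):
        self.queues.add(name)  # no-op if it already exists

    def bind(self, queue, exchange):
        if exchange not in self.exchanges:
            raise LookupError("no exchange '%s'" % exchange)
        if queue not in self.queues:
            raise LookupError("no queue '%s'" % queue)
        self.bindings.add((queue, exchange))

    def auto_delete_queue(self, queue):
        """R2+R3 as one step: drop the queue, its bindings, and any
        exchange that is left without bindings (assumed auto-delete)."""
        self.queues.discard(queue)
        orphaned = {x for (q, x) in self.bindings if q == queue}
        self.bindings = {(q, x) for (q, x) in self.bindings if q != queue}
        for x in orphaned:
            if not any(bx == x for (_, bx) in self.bindings):
                self.exchanges.discard(x)


CLIENT_STEPS = [
    ("N2", lambda b: b.declare_exchange("X")),
    ("N3", lambda b: b.declare_queue("Q")),
    ("N4", lambda b: b.bind("Q", "X")),
]


def run(cleanup_position):
    """Insert the atomic cleanup before client step `cleanup_position`."""
    broker = ToyBroker()
    steps = list(CLIENT_STEPS)
    steps.insert(cleanup_position, ("R2+R3", lambda b: b.auto_delete_queue("Q")))
    for name, op in steps:
        try:
            op(broker)
        except LookupError:
            return name + " failed"
    return "ok"


results = {pos: run(pos) for pos in range(4)}
```

In this toy model N4 fails exactly when the cleanup lands after N2 and before N4 (positions 1 and 2). Position 3 does not fail, but the freshly made binding is deleted right afterwards, which is a different problem.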

>
> (Also, doing this would require timestamps in queues, exchanges and
> bindings, updated cluster-wide every time they are re-declared. That would
> be a big problem for performance.)

Agreed, timestamps are surely not that easy to manage in a distributed
environment because they require the clocks on the different nodes to be in
sync. Vector clocks may be another solution, but I am not well versed
enough in distributed systems to give any real feedback on this; that was
just a naive idea on my part anyway :)

>
>
>> Does my analysis above make sense?
>>
>
> So I'm not sure it does.
>
>> Also, when reading the AMQP 0-9-1 reference
>> (https://www.rabbitmq.com/amqp-0-9-1-reference.html) I found that:
>>
>>   * The server SHOULD allow for a reasonable delay between the point
>>     when it determines that an exchange is not being used (or no longer
>>     used), and the point when it deletes the exchange. At the least it
>>     must allow a client to create an exchange and then bind a queue to
>>     it, with a small but non-zero delay between these two actions.
>>
>>
>> Does my finding contradict this? And how big is the delay?
>>
>
> It does; there is no delay in RabbitMQ.
>
> Note that if there were a delay then you'd still have exactly the same
> race anyway, it wouldn't solve anything.
>
> You can get a similar delay for the queue deletion, by using x-expires
> instead of auto-delete. But once the queue goes, the bindings and exchange
> all go atomically.
>
> So how to solve your problem? It sounds like each thing that can receive
> RPC replies declares an exchange, queue and binding for that purpose, with
> the exchange and queue not linked in to anything else. Is that correct?
>

Yes, that's the main idea of how OpenStack implements RPC.

>
> If that is so, why not only declare a queue, and have the RPC server
> publish to that queue directly? The queue could still be auto-delete, and
> everything would be simpler. Or is there some subtlety I'm not getting?

You mean by publishing to the "nameless exchange"? That sounds much
easier, but I am not sure why that is not the case; I probably need to ask
on the OpenStack mailing list first.
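As I understand the suggestion, it would look roughly like the sketch below. It assumes `channel` is an open pika `BlockingChannel` (any object with the same `queue_declare` and `basic_publish` methods would do); the helper names `declare_reply_queue` and `send_reply`, and the choice of a server-named queue, are my own assumptions and not how OpenStack does it today.

```python
def declare_reply_queue(channel):
    """Declare an auto-delete reply queue and return its name.

    An empty queue name asks the broker to generate one. No exchange and
    no binding are declared: every queue is reachable through the default
    ("nameless") exchange using the queue name as routing key.
    """
    result = channel.queue_declare(queue="", auto_delete=True)
    return result.method.queue


def send_reply(channel, reply_queue, body):
    """RPC server side: publish straight to the caller's reply queue."""
    channel.basic_publish(exchange="", routing_key=reply_queue, body=body)
```

With only a queue to declare there is no N2 or N4 step left, so the race between R3 and N4 cannot happen in this form.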

>
>
>> One last thing is the auto-delete deprecation from
>> http://www.rabbitmq.com/amqp-0-9-1-errata.html, point 25:
>>
>> The 'auto-delete' flag on 'exchange.declare' got deprecated in 0-9-1.
>> Auto-delete exchanges are actually quite useful, so this flag should be
>> restored.
>>
>> Does this mean the auto-delete flag will not be removed from RabbitMQ or
>> what ?
>>
>
> It means we will continue to support it, yes.
>
> Cheers, Simon
>

Thanks for the quick answer too, and sorry for answering you directly
instead of replying to the mailing list; silly mistake on my part :(
