[rabbitmq-discuss] Cluster Pathology

Fri Feb 13 10:26:00 GMT 2009

Drew,

On Thu, Feb 12, 2009 at 7:31 PM, Drew Smathers <drew.smathers at gmail.com> wrote:
> Steps to reproduce:
>
> 1. run publisher and consumer against one node to ensure queue is created there:
>
>  $ python publisher.py hostB 5
>  $ python consumer.py hostB # CTL-C after receiving 5 messages
>
> 2. run publisher/consumer against other node - hostA
>
>  $ python publisher.py hostA 20
>  $ python consumer.py hostA
>
> 3. Before publisher from step 2 has finished, bring down rabbitmq on hostB
>
>  hostB $ rabbitmqctl stop
>
> 4. After publisher from step 2 has finished, restart consumer:
>
>  $ python consumer.py hostA
>
> Notice messages delivered after hostB was brought down were not delivered.

Yes, this is the behaviour I would expect as well. As indicated
previously (on this thread and the other related one) this is because
the queue to which both consumers are subscribed was initially
declared on node B. Because there is

a) no automatic failover, just recovery;
b) no propagation of the queue removal event to each consumer (the
spec compliance issue);

the queue is taken down along with node B, and the consumer attached
via node A will be none the wiser. Any subsequent messages published to
that queue will be treated as unroutable and hence discarded. To recover
from this situation, you would need to restart node B and restart the
consumer on node A.
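
To make the sequence above concrete, here is a rough toy model in Python.
It is only a sketch of the behaviour described in this mail, not RabbitMQ
code: the Node and ToyCluster classes, the run_scenario function and the
queue name "test" are all made up for illustration, and nothing here
talks to a real broker. The one assumption it encodes is that a queue
lives on the node where it was first declared and that there is no
failover.

```python
# Toy model only. Node, ToyCluster and run_scenario are hypothetical names;
# this is not RabbitMQ code and it makes no network calls.

class Node:
    def __init__(self, name):
        self.name = name
        self.up = True


class ToyCluster:
    def __init__(self):
        self.home = {}       # queue name -> Node on which it was first declared
        self.delivered = []  # messages that reached the queue's consumer
        self.discarded = []  # messages dropped as unroutable

    def declare(self, queue, node):
        # Only the first declaration decides where the queue lives;
        # declaring it again via another node does not move it.
        self.home.setdefault(queue, node)

    def publish(self, queue, body):
        # The publisher may be connected to any node. What matters is
        # whether the node hosting the queue is still up.
        node = self.home.get(queue)
        if node is None or not node.up:
            self.discarded.append(body)
            return False
        self.delivered.append(body)
        return True

    def stop(self, node):
        # No failover: the queues on this node simply become unavailable.
        node.up = False


def run_scenario(total, stop_after):
    """Publish `total` messages, stopping the queue's node after `stop_after`."""
    host_a, host_b = Node("hostA"), Node("hostB")
    cluster = ToyCluster()
    cluster.declare("test", host_b)  # step 1: queue is created on hostB
    cluster.declare("test", host_a)  # step 2: same queue used via hostA
    for i in range(total):
        if i == stop_after:
            cluster.stop(host_b)     # step 3: hostB goes down mid-publish
        cluster.publish("test", "message %d" % i)
    return len(cluster.delivered), len(cluster.discarded)


if __name__ == "__main__":
    delivered, discarded = run_scenario(total=20, stop_after=8)
    print("delivered: %d, discarded: %d" % (delivered, discarded))
```

The point at which hostB is stopped (after 8 messages) is arbitrary. In
this model everything published after that point is discarded, even
though the publisher and consumer are both connected to hostA, which is
the effect seen in step 4.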

Obviously it would be nice to have better handling for this kind of
thing, which will probably happen at some stage.

HTH,

Ben