[rabbitmq-discuss] Pathological dead queue behavior in a cluster.
Jason J. W. Williams
jasonjwwilliams at gmail.com
Wed Mar 30 22:17:19 BST 2011
I've run into an interesting edge case that I don't seem to have a solution
for. I've got a local cluster with three nodes (2 disk, 1 RAM) called:
rabbit, rabbit_1 and rabbit_2.
If I attach a consumer (https://gist.github.com/895274) to rabbit_1
(declares queue, exchange and bindings) and start consuming, I can publish (
https://gist.github.com/895275) to rabbit or rabbit_2 and the messages are
consumed as expected. At this point the queue is living on rabbit and my
consumer for it is attached to rabbit_1.
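
For illustration, the consumer setup described above can be sketched roughly as follows. This is not the code from the gists. It is a minimal sketch that assumes the pika 1.x Python client; the exchange, queue and routing-key names and the port used for rabbit_1 are hypothetical.

```python
def declare_and_consume(channel, on_message,
                        exchange="demo_exchange",   # hypothetical name
                        queue="demo_queue",         # hypothetical name
                        routing_key="demo_key",     # hypothetical name
                        durable=False):
    """Declare exchange, queue and binding, then register a consumer.

    `channel` is expected to behave like a pika 1.x BlockingChannel.
    Returns the consumer tag reported by basic_consume.
    """
    channel.exchange_declare(exchange=exchange,
                             exchange_type="direct",
                             durable=durable)
    channel.queue_declare(queue=queue, durable=durable)
    channel.queue_bind(queue=queue, exchange=exchange,
                       routing_key=routing_key)
    return channel.basic_consume(queue=queue,
                                 on_message_callback=on_message,
                                 auto_ack=True)


def open_channel_on_rabbit_1(port=5673):
    """Open a channel to rabbit_1.

    Assumption: rabbit_1 listens on localhost:5673 in the local cluster.
    pika is imported lazily so the helper above can be used without it.
    """
    import pika
    params = pika.ConnectionParameters(host="localhost", port=port)
    return pika.BlockingConnection(params).channel()
```

Against a live broker this would be used as `ch = open_channel_on_rabbit_1()`, then `declare_and_consume(ch, callback)`, then `ch.start_consuming()`.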
If I shut down rabbit, the queue on it disappears as expected (the queue at
this point is non-durable). However, the queue's consumer on rabbit_1 sits
there fat, dumb and happy even though the queue no longer exists. Is this
expected behavior?
To test further, I changed the queue to a durable one and replayed the
previous events. This time, since the consumer was still sitting there fat,
dumb and happy, I restarted the rabbit node and then published again. Still
nothing reached the consumer, and the queue (which had returned) had a
message count of 0. Running list_bindings showed that the binding still
existed, but no amount of publishing would deliver a message to the restored
queue or to the consumer. The only thing that allowed the queue to receive
messages again (and the consumer to receive them) was to redeclare the
binding. This seems like an error, since the binding was reported as still
existing.
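
The workaround of redeclaring the binding could look something like the sketch below. It assumes a channel object with the pika 1.x BlockingChannel interface; the function name is hypothetical and nothing here is taken from the original gists. Re-issuing queue.bind with the same arguments is idempotent in AMQP 0-9-1, so it is safe even when the broker already lists the binding.

```python
def redeclare_binding(channel, exchange, queue, routing_key):
    """Re-issue the binding and return the queue's current message count.

    `channel` is assumed to behave like a pika 1.x BlockingChannel.
    """
    # Binding again with identical arguments is a no-op on a healthy broker.
    channel.queue_bind(queue=queue, exchange=exchange,
                       routing_key=routing_key)
    # passive=True only inspects the queue; it never creates it.
    declare_ok = channel.queue_declare(queue=queue, passive=True)
    return declare_ok.method.message_count
```

The returned count is only a rough indicator: with an active auto-acking consumer it will normally be 0 even when routing works.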
Any help or pointers are greatly appreciated, as with this behavior
consumers of a queue in a cluster could sit forever and never consume a
message after a failed node (and if the queue was durable, messages appear
to be blackholed despite the node being restored). This was on RabbitMQ
2.4.0 on OS X 10.6.7.
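
One way a client might guard against waiting forever is to probe periodically for the queue with a passive declare. The sketch below is an assumption-laden illustration, not a confirmed fix: it assumes the pika 1.x client, where a passive declare of a missing queue closes the channel and raises `pika.exceptions.ChannelClosedByBroker`, and it has not been verified that RabbitMQ 2.4.0 reports the queue as missing in exactly the situation described. The function name and the `not_found_errors` parameter are hypothetical.

```python
def queue_exists(open_channel, queue, not_found_errors=(Exception,)):
    """Return True if a passive declare of `queue` succeeds.

    `open_channel` is a callable returning a fresh channel (for example
    `connection.channel` with pika). A fresh channel is used for every
    probe because the broker closes the channel when the queue is not
    found. With pika 1.x, pass
    not_found_errors=(pika.exceptions.ChannelClosedByBroker,).
    """
    channel = open_channel()
    try:
        channel.queue_declare(queue=queue, passive=True)
    except not_found_errors:
        return False
    channel.close()
    return True
```

A consumer could call this from a timer and, on `False`, redeclare the queue and binding and resubscribe.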
Thank you in advance.