[rabbitmq-discuss] Clustering Issue
jvarona at attinteractive.com
Fri Aug 28 23:08:20 BST 2009
Thanks for the quick response. I understand that if Server A dies
other nodes should be able to assume that the queue is dead as well. My
real concern is that the queue(s) are not distributed and the box that
carries the physical queue is a single point of failure. While the
current clustering helps load-balance read operations very well, it
doesn't address durability and reliability concerns, which happen to be
important to my project.
I guess my real question is whether there are, or will be, efforts to
distribute the queues across physical nodes. I understand that this
implies locking, consensus, and a bunch of other things that could
hinder scale. If not, do you have any recommendations on how I could
provide higher durability and fault-tolerance guarantees to my
From: Matthias Radestock [mailto:matthias at lshift.net]
Sent: Friday, August 28, 2009 2:52 PM
To: Jorge Varona
Cc: rabbitmq-discuss at lists.rabbitmq.com
Subject: Re: [rabbitmq-discuss] Clustering Issue
Jorge Varona wrote:
> I've noticed some issues with clustered boxes that are weird. For
> example, in a two-box cluster I have Client A sending messages to Server
> A and Client B pulling messages from Server B. We already know that if
> we shut down Server A (it was first to declare a queue) messages stop
> being delivered to Server B and in turn Client B. The strange behavior
> I've noticed is that if I bring Server A back up and send messages to it
> they are not relayed to Server B, which has Client B attached. Only
> after I restart Server B do messages begin to be relayed to Client B.
When node A dies, as far as B is concerned all the queues on A die too.
If client B attempted a 'basic.get', or indeed any other operation, on
any of A's queues, it would get a 'not_found' error.
BUT - and this is what you are seeing - there is no way in AMQP to
inform existing subscribers that a queue has vanished.
This isn't just a problem for clustering - you can run into the same
issue on just a single node if one client consumes from a queue and
another client removes that queue.
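
To make the difference between the two cases above concrete, here is a toy
in-memory model in Python. It is only a sketch under stated assumptions: it
is not RabbitMQ or AMQP client code, and ToyBroker, NotFound and the queue
name "orders" are hypothetical names invented for illustration. It shows an
explicit operation on a vanished queue failing with a 'not_found'-style
error, while an existing subscriber simply stops receiving messages without
being told anything.

```python
class NotFound(Exception):
    """Stands in for the 'not_found' error described above (hypothetical)."""


class ToyBroker:
    """Toy model of one broker; not real RabbitMQ behaviour in every detail."""

    def __init__(self):
        self.queues = {}     # queue name -> list of pending messages
        self.consumers = {}  # queue name -> list of subscriber callbacks

    def declare(self, name):
        self.queues.setdefault(name, [])
        self.consumers.setdefault(name, [])

    def delete(self, name):
        # The queue vanishes. Subscriptions are dropped, but nothing is
        # sent to the subscribers to tell them so.
        self.queues.pop(name, None)
        self.consumers.pop(name, None)

    def basic_consume(self, name, callback):
        if name not in self.queues:
            raise NotFound(name)
        self.consumers[name].append(callback)

    def basic_get(self, name):
        # An explicit operation on a missing queue fails loudly.
        if name not in self.queues:
            raise NotFound(name)
        pending = self.queues[name]
        return pending.pop(0) if pending else None

    def publish(self, name, body):
        if name not in self.queues:
            return False  # nowhere to put the message
        if self.consumers[name]:
            self.consumers[name][0](body)  # push to the first subscriber
        else:
            self.queues[name].append(body)  # hold until someone fetches it
        return True


broker = ToyBroker()
broker.declare("orders")

received = []
broker.basic_consume("orders", received.append)
broker.publish("orders", "m1")  # delivered to the subscriber

broker.delete("orders")         # e.g. the owning node died
broker.publish("orders", "m2")  # the subscriber sees nothing, and no error

try:
    broker.basic_get("orders")
    got_not_found = False
except NotFound:
    got_not_found = True

# The queue is declared again, but the old subscription is not restored.
broker.declare("orders")
broker.publish("orders", "m3")
fetched = broker.basic_get("orders")
```

In this model the subscriber only ever sees "m1". After the queue is
declared again, "m3" waits in the queue until some client fetches it or
subscribes anew, which is one plausible reading of why messages only flowed
again after the consuming side was restarted.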
> Here are my assumptions:
> 1. Queues exist only on the server on which they were first declared.
> 2. Nodes within a cluster relay requests to the server on which the
> queue exists instead of messages being relayed to the server after
> are there efforts to address these issues/scenarios?
It is possible that AMQP 1.0 addresses this. Not sure.