[rabbitmq-discuss] Clustering Issue
Jorge Varona
jvarona at attinteractive.com
Fri Aug 28 23:08:20 BST 2009
Matthias,
Thanks for the quick response. I understand that if Server A dies,
other nodes should be able to assume that the queue is dead as well. My
real concern is that the queue(s) are not distributed and the box that
carries the physical queue is a single point of failure. While the
current clustering helps load-balance read operations very well, it
doesn't address durability and reliability concerns, which happen to be
important to my project.
I guess my real question is whether there are, or will be, efforts to
distribute the queues across physical nodes. I understand that this
implies locking, consensus, and a bunch of other things that could
hinder scale. If not, do you have any recommendations on how I could
provide higher durability and fault-tolerance guarantees to my
consumers?
Jorge
-----Original Message-----
From: Matthias Radestock [mailto:matthias at lshift.net]
Sent: Friday, August 28, 2009 2:52 PM
To: Jorge Varona
Cc: rabbitmq-discuss at lists.rabbitmq.com
Subject: Re: [rabbitmq-discuss] Clustering Issue
Jorge,
Jorge Varona wrote:
> I've noticed some issues with clustered boxes that are weird. For
> example, in a two-box cluster I have Client A sending messages to
> Server A and Client B pulling messages from Server B. We already know
> that if we shut down Server A (it was first to declare a queue)
> messages stop being delivered to Server B and in turn Client B. The
> strange behavior I've noticed is that if I bring Server A back up and
> send messages to it they are not relayed to Server B, which has
> Client B attached. Only after I restart Server B do messages begin to
> be relayed to Client B.
When node A dies, as far as B is concerned all the queues on A die too.
If client B attempted a 'basic.get', or indeed any other operation on
any of A's queues, it would get a 'not_found' error.
BUT - and this is what you are seeing - there is no way in AMQP to
inform existing subscribers that a queue has vanished.
This isn't just a problem for clustering - you can run into the same
issue on just a single node if one client consumes from a queue and
another client removes that queue.
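The behaviour described above can be sketched with a small in-memory model.
This is only an illustration under stated assumptions, not RabbitMQ code:
ToyCluster, NotFound and demo are hypothetical names, and the model ignores
channels, acknowledgements and everything else a real broker does. It shows
that an explicit operation on a vanished queue fails, while an existing
subscriber simply stops receiving messages without being told. A consumer
could work around this by probing the queue periodically (for instance with a
passive queue.declare, which fails with not_found when the queue is gone) and
re-subscribing.

```python
class NotFound(Exception):
    """Stands in for the AMQP 'not_found' channel error."""


class ToyCluster:
    """Hypothetical toy model: each queue lives only on its declaring node."""

    def __init__(self):
        self.owner = {}        # queue name -> node that holds the queue
        self.pending = {}      # queue name -> undelivered messages
        self.subscribers = {}  # queue name -> consumer callbacks

    def declare(self, node, queue):
        # The first declaration decides which node holds the queue.
        if queue not in self.owner:
            self.owner[queue] = node
            self.pending[queue] = []
            self.subscribers[queue] = []

    def _require(self, queue):
        if queue not in self.owner:
            raise NotFound(queue)

    def publish(self, queue, message):
        self._require(queue)
        if self.subscribers[queue]:
            self.subscribers[queue][0](message)
        else:
            self.pending[queue].append(message)

    def basic_get(self, queue):
        self._require(queue)
        return self.pending[queue].pop(0) if self.pending[queue] else None

    def consume(self, queue, callback):
        self._require(queue)
        self.subscribers[queue].append(callback)
        while self.pending[queue]:
            callback(self.pending[queue].pop(0))

    def kill(self, node):
        # The node's queues vanish; no subscriber callback is invoked.
        for queue in [q for q, n in self.owner.items() if n == node]:
            del self.owner[queue], self.pending[queue], self.subscribers[queue]


def demo():
    cluster = ToyCluster()
    cluster.declare("A", "q")
    received = []
    cluster.consume("q", received.append)
    cluster.publish("q", "m1")             # delivered to the subscriber
    cluster.kill("A")                      # the subscriber hears nothing
    cluster.declare("A", "q")              # node A is back, queue re-declared
    cluster.publish("q", "m2")             # held in the new queue
    stalled = list(received)               # snapshot before re-subscribing
    cluster.consume("q", received.append)  # re-subscribing resumes delivery
    return stalled, received
```

In this model demo() returns the messages seen before and after the
re-subscription, so the stalled subscriber can be compared with the recovered
one.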
> Here are my assumptions:
>
> 1. Queues exist only on the server on which they were first declared.
>
> 2. Nodes within a cluster relay requests to the server on which the
> queue exists instead of messages being relayed to the server after
> first received.
Both correct.
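As a rough illustration of those two assumptions, here is a minimal sketch.
The Node class and the shared directory dict are hypothetical stand-ins for
whatever the cluster uses internally; the only point is that a request
arriving at any node is forwarded to the node that holds the queue, and
messages are stored nowhere else.

```python
class Node:
    """Hypothetical cluster node; 'directory' maps queue name -> owning Node."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory  # shared by every node in the cluster
        self.local_queues = {}      # queues physically held on this node

    def declare(self, queue):
        # Assumption 1: the queue exists only where it was first declared.
        owner = self.directory.setdefault(queue, self)
        if owner is self:
            self.local_queues.setdefault(queue, [])
        return owner.name

    def publish(self, queue, message):
        # Assumption 2: the request is relayed to the owning node.
        owner = self.directory[queue]
        owner.local_queues[queue].append(message)

    def get(self, queue):
        owner = self.directory[queue]
        pending = owner.local_queues[queue]
        return pending.pop(0) if pending else None


directory = {}
a = Node("A", directory)
b = Node("B", directory)
a.declare("q")            # A declares first, so A holds the queue
b.declare("q")            # B's declaration resolves to the queue on A
b.publish("q", "hello")   # sent via B, stored on A
```

After this runs, b.local_queues stays empty, which is why losing A loses the
queue no matter which node the clients were connected to.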
> are there efforts to address these issues/scenarios?
It is possible that AMQP 1.0 addresses this. Not sure.
Regards,
Matthias.