[rabbitmq-discuss] Cluster capability?

Thu Nov 28 13:36:31 GMT 2013

Hi Simon,

Before I get too much deeper into this, let me do two things: ask a fundamental question, and then try to explain some of the project design.

The question: I seem to recall that a queue can be construed as an outbound route from an exchange. In RabbitMQ we publish to an exchange and we consume from a queue. If a message arrives at an exchange and there is no outbound route for it, then the broker discards the message (I think). But what if there is no component that reads from the queue? A queue can be created by static configuration or programmatically. It can be mirrored to other nodes via HA queues; but if it's mirrored from computer A to computer B and yet there is no consuming software on B, then what happens to messages sent to the exchange that the queue is bound to?
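
For illustration, the two cases in that question can be sketched with a toy in-memory model. This is plain Java, not the RabbitMQ client API, and `RoutingDemo`, `ToyExchange`, `bind` and `publish` are hypothetical names. It assumes the broker's default behaviour: a message with no matching binding is dropped (no mandatory flag, no alternate exchange), whereas a message routed to a queue simply waits there, whether or not anything is consuming yet.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT the RabbitMQ client API. All names are hypothetical.
class RoutingDemo {

    static class ToyExchange {
        // routing key -> queues bound with that key (direct-exchange semantics)
        private final Map<String, List<Deque<String>>> bindings = new HashMap<>();

        // Bind a queue so that messages published with routingKey are copied into it.
        void bind(Deque<String> queue, String routingKey) {
            bindings.computeIfAbsent(routingKey, k -> new ArrayList<>()).add(queue);
        }

        // Returns how many queues received the message; 0 means it was dropped.
        int publish(String routingKey, String body) {
            List<Deque<String>> targets = bindings.getOrDefault(routingKey, List.of());
            for (Deque<String> queue : targets) {
                queue.addLast(body);
            }
            return targets.size();
        }
    }

    public static void main(String[] args) {
        ToyExchange exchange = new ToyExchange();
        Deque<String> tasks = new ArrayDeque<>();
        exchange.bind(tasks, "task");

        // No binding for this key: the message is discarded.
        int dropped = exchange.publish("no-such-key", "lost");
        // Bound queue but no consumer: the message is stored and waits.
        int routed = exchange.publish("task", "job-1");

        System.out.println(dropped + " " + routed + " " + tasks.size());
    }
}
```

In this model the absence of a consumer only means the queue grows. As far as the HA documentation describes it, a mirror on computer B is a replica of the same logical queue rather than a second queue that needs its own consumer.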

As to the design of this project: I like the cluster idea because it seems to offer a way for "control" components on every node to share a unified view of their world. Computers and my services thereon can join (and leave) the cluster, and by publishing to their LOCAL broker keep each other informed about themselves and the services they offer. 

But some of the applications (as opposed to the control components) in this architecture need to share what I'll call a "task queue." The applications write messages to this queue (to its exchange) that represent work that needs to be done. The same applications, in a kind of recursive decomposition, read from the same queue and execute the work indicated by these messages. Now, with many such applications spread over several computers, the workload is nicely parallelized.

My concern is that in an HA Queue implementation, a given message could be consumed more than once, i.e., once on computer A and once on B. Consequently, the same unit of work will be executed twice, and that's no good.
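
The worry above can be made concrete with a toy model of a single logical queue shared by competing consumers. This is plain Java rather than the RabbitMQ client API, `WorkQueueDemo` and `ToyQueue` are hypothetical names, and message bodies are assumed to be unique so they can serve as keys. The assumption being illustrated is that a mirrored queue behaves as one queue: each message is handed to one consumer at a time and comes back only if that consumer fails before acknowledging it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model only: NOT the RabbitMQ client API. All names are hypothetical.
class WorkQueueDemo {

    static class ToyQueue {
        private final Deque<String> ready = new ArrayDeque<>();
        // message body -> consumer that holds it but has not acknowledged it yet
        private final Map<String, String> unacked = new HashMap<>();

        void enqueue(String message) {
            ready.addLast(message);
        }

        // Hands the next ready message to exactly one consumer; null if none is ready.
        String deliver(String consumer) {
            String message = ready.pollFirst();
            if (message != null) {
                unacked.put(message, consumer);
            }
            return message;
        }

        // The work is done: the message is removed for good.
        void ack(String message) {
            unacked.remove(message);
        }

        // The consumer died before acknowledging: its messages become ready again.
        void consumerCrashed(String consumer) {
            Iterator<Map.Entry<String, String>> it = unacked.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, String> entry = it.next();
                if (entry.getValue().equals(consumer)) {
                    ready.addFirst(entry.getKey());
                    it.remove();
                }
            }
        }
    }

    public static void main(String[] args) {
        ToyQueue queue = new ToyQueue();
        queue.enqueue("job-1");
        queue.enqueue("job-2");

        String first = queue.deliver("A");   // A gets job-1
        String second = queue.deliver("B");  // B gets job-2, never job-1
        queue.ack(second);

        queue.consumerCrashed("A");          // job-1 was never acknowledged
        String again = queue.deliver("B");   // job-1 is delivered a second time

        System.out.println(first + " " + second + " " + again);
    }
}
```

Under these assumptions the only way a unit of work runs twice is redelivery after a failed or rejected delivery, not the mere presence of consumers on both A and B.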

I'm not sure if I've fully conveyed why I find this aspect troublesome. But please give me your thoughts on it, and we can always dig deeper.

Thank you.

Cordially,

Paul

> On Nov 26, 2013, at 4:55 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
> 
>> On 25/11/2013 18:10, Paul Bell wrote:
>> "Virtual hosts, exchanges, users, and permissions are automatically
>> mirrored across all nodes in a cluster. Queues may be located on a
>> single node, or mirrored across multiple nodes
>> <http://www.rabbitmq.com/ha.html>. A client connecting to any node in a
>> cluster can see all queues in the cluster, even if they are not located
>> on that node."
>> 
>> Question: does the last sentence apply to both queues located on a
>> single node and to mirrored (HA) queues?
> 
> Yes.
> 
>> The last clause of that
>> sentence, i.e., "...even if they are not located on that node" suggests
>> that whether or not the queues are mirrored shouldn't matter; that is,
>> a client will see the queues regardless of the node it connects to.
> 
> Yes. And of course just because a queue is mirrored doesn't mean it has to be on *every* node in the cluster - you don't have to use ha-mode: all.
> 
>> The cluster approach "feels" complex to me but that is, at least in
>> part, due to the fact that my once modest Rabbit skills have atrophied
>> over the last couple of years. So, here's some thinking out loud with
>> attendant questions:
>> 
>> 1. Let's see, so, the several computers on which my (Java) RabbitMQ
>> component runs will be joined together in a cluster.
>> 2. What happens if Computer A creates some exchanges and then computer
>> B, either before or after joining the cluster, creates the same
>> exchanges? Maybe the usual Rabbit idempotency holds here; i.e., if
>> the exchange already exists within the cluster, it's not going to be re-created...?
> 
> Yes, the exchange (along with other metadata) is held in a distributed database and created atomically.
> 
>> 3. And if the component publishes to an exchange, that message should (I
>> think) be visible on the mirrored exchanges on all of the other nodes in
>> the cluster, right?
> 
> Yes. We don't normally talk about "mirrored exchanges" since all exchanges are inherently mirrored.
> 
>> 4. And, moreover, once a consumer on any cluster node reads that
>> message, no other consumer will be able to read it. I sure hope this is
>> true.
> 
> If it only got routed to one queue. Obviously if it was routed to two queues it can be read twice, and so on.
> 
> Also, the message can be seen by two consumers if the first is consuming in ack mode, receives the message, and then either rejects it or crashes before acknowledging it. That's true on a single node too, though; it has nothing to do with clustering.
> 
> Cheers, Simon
>