[rabbitmq-discuss] Are queues replicated across a cluster?

Fri Jan 14 17:08:32 GMT 2011

Hi Bill,

On Thu, Jan 13, 2011 at 06:42:19AM -0800, Bill Moseley wrote:
> I don't understand that sentence that says the queues only reside on the
> node that created them -- but they are visible on both.  Is that talking
> about replicating *messages* or queues?  Queues and messages are separate
> things.  I understand that messages are not duplicated/replicated in the
> cluster.

Within a cluster, whether or not a queue exists is recorded in mnesia, a
distributed Erlang database. Thus all nodes within the cluster know that
a particular queue exists. Each queue, however, resides on only a single
node. I.e. only that node will receive and hold the messages for that
queue, and if that node goes down then the queue, and the messages it
contains, become unavailable.

The fact that the queue's existence is known to all nodes means that any
node that needs to route a message to said queue will be able to do so.
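
To make the distinction concrete, here is a toy in-memory model. It is a
sketch only, not RabbitMQ code: the class ToyCluster and all of its
method names are made up for illustration, and a plain dict stands in
for the shared mnesia record.

```python
class ToyCluster:
    """Toy model only: shared queue directory, per-node message storage."""

    def __init__(self, nodes):
        self.running = set(nodes)               # nodes currently up
        self.directory = {}                     # queue -> home node, known to all nodes
        self.messages = {n: {} for n in nodes}  # node -> {queue: [messages]}

    def create_queue(self, node, queue):
        """Record the queue cluster-wide, but store it on one node only."""
        self.directory[queue] = node
        self.messages[node][queue] = []

    def publish(self, via_node, queue, body):
        """Publish through any running node; the message lands on the home node."""
        if via_node not in self.running:
            raise ConnectionError(f"{via_node} is down")
        home = self.directory[queue]            # lookup works from every node
        if home not in self.running:
            raise LookupError(f"{queue} is unavailable: {home} is down")
        self.messages[home][queue].append(body)
        return home

    def stop(self, node):
        """Simulate a node failure."""
        self.running.discard(node)


cluster = ToyCluster(["node1", "node2"])
cluster.create_queue("node1", "tasks")
home = cluster.publish("node2", "tasks", "hello")   # routed from node2 to node1
```

Publishing via node2 works because node2 knows the queue exists, but the
message is held on node1 only. After cluster.stop("node1"), the same
publish raises an error in this model, mirroring the queue becoming
unavailable.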

> What's the distinction between "residing" and "visible"?  Does it mean the
> messages are only stored on node1 until consumed (by either node)?

Yes, precisely.

> But, I
> can declare the same queue on both nodes and see the same behavior.  Can
> someone please clear this up for me?

The second declaration is really just an assertion that the queue
exists. The queue is created on whichever node the client that first
declares it is connected to.
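
The "first declaration creates, later declarations assert" behaviour can
be sketched in a few lines. This is an illustration under assumptions,
not how the broker implements it: the function declare and the dict
directory are hypothetical stand-ins for a queue declaration and the
cluster-wide record of queues.

```python
def declare(directory, connected_node, queue):
    """Return the node the queue lives on, creating it only if it is new."""
    # setdefault stores connected_node only when the queue is not yet known;
    # otherwise the existing home node is returned unchanged.
    return directory.setdefault(queue, connected_node)


directory = {}  # queue name -> home node (stand-in for the shared record)
first = declare(directory, "node1", "tasks")    # creates the queue on node1
second = declare(directory, "node2", "tasks")   # only asserts that it exists
```

Both calls report node1 as the home of the queue: declaring it again via
node2 does not move it or create a second copy.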

> What I have failed to find when reading about clustering is how to connect
> to the nodes in a cluster.  If I have node1 and node2 in my cluster and 10
> producers (say web servers) and 4 worker machines do I configure 5 producers
> to connect to node1 and 5 to node2, and then likewise split the consumer
> connections across the two nodes?

Entirely up to you - resources are visible throughout the cluster and so
any sane load balancing should be just fine. If they all connect to the
same node then you'll likely find that all resources exist solely on
that node and thus the other nodes are not being used at all. Thus
really, you want to make sure that the clients which create resources
(queues in particular) are connected across all the nodes.
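
One simple way to spread clients across the nodes is round-robin
assignment. The sketch below assumes the client names web1..web10 and
the node names node1 and node2 purely for illustration; any other sane
balancing scheme would do equally well.

```python
from itertools import cycle


def assign_round_robin(clients, nodes):
    """Map each client to a node, cycling through the nodes in order."""
    return dict(zip(clients, cycle(nodes)))


producers = [f"web{i}" for i in range(1, 11)]   # ten hypothetical producers
assignment = assign_round_robin(producers, ["node1", "node2"])
```

With ten producers and two nodes, five end up on each node, so the
queues those producers declare are spread over both nodes rather than
all residing on one.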

The other thing to consider is the inter-node hops required to route
messages. Thus you may wish to make sure that if client 1 is
predominantly sending messages to or consuming messages from queue 1,
that client 1 is connected to the same node on which queue 1 resides.
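
If you already know where each queue resides and roughly how much
traffic a client exchanges with each queue, picking the node that saves
the most hops is a small calculation. The function name, the queue
names and the message counts below are all assumptions made up for this
sketch.

```python
from collections import Counter


def preferred_node(queue_homes, traffic):
    """Return the node hosting the queues this client talks to the most."""
    per_node = Counter()
    for queue, message_count in traffic.items():
        # Add this queue's traffic to the node on which the queue resides.
        per_node[queue_homes[queue]] += message_count
    return per_node.most_common(1)[0][0]


queue_homes = {"queue1": "node1", "queue2": "node2"}
traffic = {"queue1": 900, "queue2": 100}    # messages sent or consumed by client 1
best = preferred_node(queue_homes, traffic)
```

Here client 1 mostly uses queue1, so it would connect to node1; its
occasional messages for queue2 still work, they just take one extra hop.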

> To be clear, this is not HA.  So, if node1 fails then 1/2 the producers and
> 1/2 the consumers can no longer function.  Correct?

Well, as explained, it depends where the queues reside, but yes, you
will likely lose some resources, and all the clients connected to the
failed node will need to reconnect to a surviving node.
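
On the client side, reconnecting to a surviving node can be as simple
as trying each node in turn. The sketch below is not tied to any client
library: connect_to_any is a hypothetical helper, and fake_connect is a
stand-in for whatever call your library uses to open a connection.

```python
def connect_to_any(nodes, connect):
    """Try each node in order; return (node, connection) for the first success."""
    errors = {}
    for node in nodes:
        try:
            return node, connect(node)
        except OSError as exc:      # ConnectionError is a subclass of OSError
            errors[node] = exc
    raise ConnectionError(f"no node reachable: {errors}")


def fake_connect(node):
    # Stand-in for a real client library call; pretends node1 has failed.
    if node == "node1":
        raise ConnectionRefusedError("node1 is down")
    return f"connection to {node}"


node, conn = connect_to_any(["node1", "node2"], fake_connect)
```

Note that this only restores the connection. Queues that resided on the
failed node remain unavailable until that node comes back.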

> Or is the typical approach to use something like HAProxy in front of node1
> and node2 and have all consumers and producers connect to HAProxy?  (That
> ends up as a single point of failure.)

Yup, indeed. Though with enough magic hackery you can remove the SPoF
even with something like HAProxy in front. But it does depend on your
precise performance requirements etc.

Matthew