[rabbitmq-discuss] RabbitMQ production setup questions around clustering

Thu Jul 22 13:05:34 BST 2010

Hi Dave,

> Questions:
> 2. When a queue is declared (by one of our producers), is the queue always
> created on the rabbitmq node the producer client has a connection to or is the
> queue created on a randomly selected node within the cluster?

The queue is created only on the node the declaring client is connected
to, but it is accessible from all nodes in the cluster.  All nodes in the
same cluster are functionally interchangeable, i.e. it doesn't matter which
of the nodes you connect to, you can still use the queue.
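
To make that concrete, here is a rough sketch that declares and publishes
through one node and then fetches through another.  It assumes the pika
Python client (a recent 1.x API) and two clustered nodes reachable as
"rabbit1" and "rabbit2"; the host names, the queue name "tasks" and the
helper names are all made up for illustration.

```python
# Sketch only: assumes the pika client library and a two-node cluster.
# "rabbit1", "rabbit2" and "tasks" are hypothetical names.

def open_channel(host):
    """Connect to one cluster node and return (connection, channel)."""
    import pika  # imported here so the helpers below work with any channel-like object
    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    return connection, connection.channel()


def publish_via(channel, queue, body):
    """Declare the queue through this channel's node, then publish to it."""
    channel.queue_declare(queue=queue, durable=True)
    # The default exchange ("") routes by queue name.
    channel.basic_publish(exchange="", routing_key=queue, body=body)


def fetch_via(channel, queue):
    """Fetch one message through whichever node this channel is connected to."""
    method, properties, body = channel.basic_get(queue=queue, auto_ack=True)
    return body  # None if the queue was empty


def demo():
    """Publish through rabbit1, consume through rabbit2 (needs a live cluster)."""
    conn1, ch1 = open_channel("rabbit1")
    publish_via(ch1, "tasks", b"hello")
    conn1.close()

    conn2, ch2 = open_channel("rabbit2")
    print(fetch_via(ch2, "tasks"))
    conn2.close()
```

Calling demo() against a working cluster should hand back the message
even though the queue itself lives only on the first node.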

> 3. If the queue is durable and the messages sent to it are marked persistent,
> will these messages always be persisted to disk and be available after a restart
> of the node that has that queue, regardless of whether the node is a disk node
> or RAM node? (This line "Should you do this, and suffer a power failure to
> the entire cluster, the entire state of the cluster, including all messages,
> will be lost." in
> http://www.rabbitmq.com/clustering.html is confusing)

The distinction between RAM and disk nodes refers to where the Mnesia
tables are stored.  The routing configuration (what queues there are,
which exchanges they're bound to, etc.) is stored in Mnesia, so at least one
of the nodes should be a disk node.  This way, if both nodes go down
at the same time (power failure), when they're restarted, the cluster
should spring back to life just like before.

Persistent messages are written to disk regardless of whether the node is
a disc node or a RAM node, but only on the node on which the queue was
declared.  If that node is a RAM node and it goes down, it won't lose the
messages; when it rejoins the cluster, the queue should just get recreated
(assuming it was declared as durable).

If both nodes are RAM nodes, and they both go down, when they're
restarted, they won't know what queues were declared, etc.  So, in this
case, the messages will be lost.
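
The reasoning above can be written down as a toy model.  This is not how
RabbitMQ is implemented; it only encodes the two rules from this answer
(the routing configuration survives a full restart only if some node is a
disc node, and persistent messages live on the disk of the node where the
queue was declared).  It assumes every queue is durable and every message
is persistent, and all names in it are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    disc: bool  # True = disc node, False = RAM node
    # queue name -> persistent messages written to this node's disk
    messages: dict = field(default_factory=dict)


def survivors_after_full_restart(nodes, queue_home):
    """Return {queue: messages} still usable after ALL nodes stop and restart.

    queue_home maps each durable queue to the name of the node it was
    declared on.
    """
    # The routing configuration (Mnesia tables) is only on disk on disc nodes.
    if not any(n.disc for n in nodes):
        # No node remembers which queues existed, so their messages are lost.
        return {}
    by_name = {n.name: n for n in nodes}
    return {
        queue: list(by_name[home].messages.get(queue, []))
        for queue, home in queue_home.items()
    }


# One disc node plus one RAM node that hosts the queue: messages survive.
mixed = [Node("a", disc=True), Node("b", disc=False, messages={"tasks": ["m1", "m2"]})]
print(survivors_after_full_restart(mixed, {"tasks": "b"}))     # {'tasks': ['m1', 'm2']}

# Two RAM nodes: the configuration is gone after a full restart.
ram_only = [Node("a", disc=False), Node("b", disc=False, messages={"tasks": ["m1"]})]
print(survivors_after_full_restart(ram_only, {"tasks": "b"}))  # {}
```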

> 4. Should I configure both my nodes as disk nodes or will one disk node be
> sufficient? In other words, if only 1 disk node was there in the cluster and its
> hard drive went bust, what can I recover from the RAM node? If nothing can be
> recovered from the RAM node

As mentioned above, a RAM node does not store the routing configuration on
disk.  If the disc node goes down, you should still be able to recover the
broker configuration from the RAM node, as long as it stays up, by adding a
new disc node to the cluster.

If any of the nodes goes down permanently (disk failure, etc.), the
messages stored on it will be lost.  If you don't want this to happen,
have a look at the HA Clustering guide:

  http://www.rabbitmq.com/pacemaker.html

> is it mainly for increasing the number of
> connections without taking any hits to disk throughput?

Pretty much, yes.

> 5. Are connections redirects actually supported by the current version? The FAQ
> and Clustering documents on site are contradictory of each other. (FAQ says
> "Future releases will support live failover using, for instance, a
> combination of the "known hosts" field in connection.open-ok and the
> connection.redirect message.")
> 6. If redirects are supported, when all connections are being sent to a specific
> rabbitmq node in the cluster by the loadbalancer, will that rabbitmq node still
> send a redirect request to the client if it's getting too taxed? If so, then is
> it possible to limit redirects to only the disk nodes within the cluster so that
> we don't lose any data?
> 7. Will the connections be redirected solely based on RabbitMQ node's ability to
> serve it or is it more round-robin?

There is currently some load balancing done via AMQP's connection.redirect
method.

Both redirects and load balancing will soon be removed from RabbitMQ, so
it would be better not to rely on this feature.

> 8. Is there a way to ensure that the cluster configuration is correct because
> 'disc/RAM' is not reported correctly by rabbitmqctl (as per
> http://old.nabble.com/Rabbitmq-v.-1.8.1---Bug-report---Could-not-start-a-node-as-RAM-node-in-cluster-ts29211668.html)?

Not in the current release, no.  It's been fixed on the development branch
and should make it into the next release.

Hope this clears things up.

Cheers,
Alex