[rabbitmq-discuss] RabbitMQ clustering and node failure recovery strategies

Alexandru Scvorţov alexandru at rabbitmq.com
Thu Oct 7 11:47:02 BST 2010


> What do you mean by masquerading? Is cloning
> IP/hostname/cookie enough? How about Mnesia?

IIRC, just clustering a new node with the same nodename as the down node
should be enough.  So, the same hostname and cookie should do it.

If you clone everything, you'll probably end up with a setup similar to
the one in our HA guide.

Really though, deleting queues and immediately declaring them on a
different node doesn't seem right.  You're probably better off using
non-durable queues.
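
To make the durable/non-durable difference concrete, here is a toy Python
model of the behaviour described in this thread.  It is only a sketch and
not RabbitMQ code: ToyCluster, Queue, and the node names rabbit@a and
rabbit@b are made up for the illustration, and the real broker is more
involved than this.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    name: str
    home_node: str  # the node that holds the queue's contents
    durable: bool
    messages: list = field(default_factory=list)


class ToyCluster:
    """Illustrative model only; names and behaviour are assumptions."""

    def __init__(self, nodes):
        self.up = set(nodes)
        self.queues = {}

    def declare(self, name, node, durable):
        if node not in self.up:
            raise RuntimeError(f"node {node} is down")
        existing = self.queues.get(name)
        if existing is not None:
            if existing.home_node not in self.up:
                # The cluster still remembers a durable queue on a downed
                # node, so it refuses to recreate it elsewhere.
                raise RuntimeError(
                    f"queue {name} belongs to downed node {existing.home_node}"
                )
            return existing
        queue = Queue(name, node, durable)
        self.queues[name] = queue
        return queue

    def node_down(self, node):
        self.up.discard(node)
        # Non-durable queues on the failed node are dropped from the
        # cluster, together with whatever messages they held.
        lost = [n for n, q in self.queues.items()
                if q.home_node == node and not q.durable]
        for name in lost:
            del self.queues[name]


cluster = ToyCluster(["rabbit@a", "rabbit@b"])
cluster.declare("orders", "rabbit@a", durable=True)
cluster.declare("events", "rabbit@a", durable=False)
cluster.node_down("rabbit@a")

# The non-durable queue is gone, so it can be redeclared on another node.
cluster.declare("events", "rabbit@b", durable=False)

# The durable queue is still tied to the downed node.
try:
    cluster.declare("orders", "rabbit@b", durable=True)
except RuntimeError as err:
    print(err)
```

In this model the durable queue stays tied to rabbit@a until that node
comes back (or something masquerades as it and deletes the queue), while
the non-durable queue can be redeclared straight away at the cost of its
messages.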

Alex

On Thu, Oct 07, 2010 at 02:58:13AM -0700, Armax wrote:
> 
> That helps, thanks! What do you mean by masquerading? Is cloning
> IP/hostname/cookie enough? How about Mnesia?
> 
> Alexandru Scvorţov wrote:
> > 
> > Hi,
> > 
> >> I have been reading the mailing list to gather information about
> >> scalable and HA configurations for RabbitMQ. From my understanding, I
> >> assume that if a RabbitMQ node in a cluster fails, all the queues and
> >> messages on that node will be lost until the node is recovered, and any
> >> attempt at re-creating the queues on another node is forbidden by the
> >> cluster implementation. Is this still the case in the latest RabbitMQ
> >> release?
> > 
> > We're talking about durable queues.  Yes, this is still the case with
> > the latest broker.  You can't recreate the lost queue on a different
> > node because it causes all sorts of problems if the original node comes
> > back up.
> > 
> >> I was thinking about fail-over strategies and I was wondering if there
> >> is any way to tell the cluster to forget about the binding
> >> (queue-cluster node) after the failure. If I did that, then I could
> >> re-create the queue on another node.
> > 
> > The only way to remove a queue from a downed node is either to restart
> > the node and remove the queue, or have another node masquerade as that
> > node and remove it.  In the second case, you lose the messages.
> > 
> > If you don't care about losing some messages, you can just use
> > non-durable queues.  This way, when the node goes down, the queue is
> > deleted from the cluster and can be redeclared on any other node.  Of
> > course, whatever messages were on the queue are lost.
> > 
> > Does this help?
> > 
> > Cheers,
> > Alex
> > 
> > 
> > On Wed, Oct 06, 2010 at 02:26:57AM -0700, Armax wrote:
> >> 
> >> Hi,
> >> 
> >> I have been reading the mailing list to gather information about scalable
> >> and HA configurations for RabbitMQ. From my understanding, I assume that
> >> if a RabbitMQ node in a cluster fails, all the queues and messages on that
> >> node will be lost until the node is recovered, and any attempt at
> >> re-creating the queues on another node is forbidden by the cluster
> >> implementation. Is this still the case in the latest RabbitMQ release?
> >> 
> >> I was thinking about fail-over strategies and I was wondering if there is
> >> any way to tell the cluster to forget about the binding (queue-cluster
> >> node) after the failure. If I did that, then I could re-create the queue
> >> on another node.
> >> 
> >> If this is not available, is there a specific reason why this is not
> >> possible?
> >> 
> >> Many thanks and keep up the good work :)
> >> -- 
> >> View this message in context:
> >> http://old.nabble.com/RabbitMQ-clustering-and-node-failure-recovery-strategies-tp29894938p29894938.html
> >> Sent from the RabbitMQ mailing list archive at Nabble.com.
> >> 
> >> _______________________________________________
> >> rabbitmq-discuss mailing list
> >> rabbitmq-discuss at lists.rabbitmq.com
> >> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
> 
> -- 
> View this message in context: http://old.nabble.com/RabbitMQ-clustering-and-node-failure-recovery-strategies-tp29894938p29904767.html
> Sent from the RabbitMQ mailing list archive at Nabble.com.
