[rabbitmq-discuss] Query related to Rabbitmq Clustering.

Tim Watson tim at rabbitmq.com
Tue Dec 18 13:28:37 GMT 2012


On 13 Dec 2012, at 11:10, Vikrant Sayeewal wrote:

> If both of the available RabbitMQ nodes go down, does that mean I need to restart the whole cluster rather than just restarting those nodes?
> 

Restarting a single node *should* be enough, but caveats apply. If you publish a non-persistent message to node 1 and node 1 dies *before* the message can be relayed to node 2 (for example, because the network between the two nodes becomes slow or unavailable during that time), then the message could be lost. As Simon points out, you need to use persistent messages to make sure they survive.

Message loss can also occur when messages are persistent: if you publish to node 1 and it dies before transmitting the message to node 2 *and* before writing the message to disk, the message is lost.

The solution to this is to use publisher confirms. When confirms are enabled (by putting the channel into confirm mode) and messages are persistent, the broker will not send a confirm (a basic.ack) until the message has been written to disk *and* the i/o buffer has been flushed/synchronised. The broker will also ensure that the message has been transmitted to node 2 and confirmed as persisted there. So once you have seen a confirm for a message, that message will not be lost: even if both nodes go down, restarting one node will be enough to access it.
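
To make the reasoning above concrete, here is a small toy model in Python. It is only a sketch of the rules stated in this message, not RabbitMQ code: the names Node, publish_persistent and run_scenario are made up for illustration, and the flags relayed and flushed simply stand for "reached node 2 before the crash" and "reached disk before the crash".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical broker node: volatile memory plus durable disk."""
    name: str
    memory: set = field(default_factory=set)
    disk: set = field(default_factory=set)

    def crash_and_restart(self):
        # Anything held only in memory is gone; disk contents survive.
        self.memory = set(self.disk)


def publish_persistent(msg, node1, node2, relayed, flushed):
    """Publish a persistent message to node1 on a confirm-mode channel.

    relayed: the message reached node2 before anything crashed.
    flushed: the message reached disk before anything crashed.
    Returns True if, under the rules above, a confirm would be sent.
    """
    targets = [node1, node2] if relayed else [node1]
    for node in targets:
        node.memory.add(msg)
        if flushed:
            node.disk.add(msg)
    # Assumption taken from the text: the confirm is sent only once the
    # message is on disk locally and persisted on the second node.
    return relayed and flushed


def run_scenario(relayed, flushed):
    """Publish one message, take both nodes down, restart them both."""
    node1, node2 = Node("node1"), Node("node2")
    confirmed = publish_persistent("m1", node1, node2, relayed, flushed)
    for node in (node1, node2):
        node.crash_and_restart()
    holders = sorted(n.name for n in (node1, node2) if "m1" in n.memory)
    return confirmed, holders


# Node 1 dies before relaying and before writing to disk: message lost.
print(run_scenario(relayed=False, flushed=False))  # (False, [])
# Confirm seen: either node can be restarted to get the message back.
print(run_scenario(relayed=True, flushed=True))    # (True, ['node1', 'node2'])
```

The point of the model is the invariant: whenever the confirm comes back, every node holds the message on disk, so restarting any single node is enough. Without the confirm the publisher cannot tell which case it is in and should republish. In a client such as pika, the equivalent setup would be calling channel.confirm_delivery() and publishing with delivery_mode=2.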

Cheers,
Tim

