[rabbitmq-discuss] Replicated message queue? DRBD+Heartbeat?

Wed Feb 20 14:39:21 GMT 2008

Hi Tom,

Tom Samplonius wrote:
>   I see that RabbitMQ does not yet have replication for message queues.  I need a highly available message queue that never loses messages.  I'm not terribly concerned about performance, as I just need to process several thousand messages per day.  It is just hard to repair consistency if any are lost.

I am in a very similar situation - HA is more important for my project than 
throughput. It is my understanding that if you declare your queues as durable and send 
messages as persistent, your messages will be saved on disk until consumed. A bigger 
question is how much time will pass between a message being sent and that message being 
consumed. Out of the box, I guess it depends on how fast you can restore the original 
broker (app, system and/or network connectivity, depending on what failed).

>   I thought about using DRBD to replicate RabbitMQ's on-disk Mnesia store to a standby node.  I've used DRBD successfully for MySQL, and it works well.  And then use Heartbeat to start RabbitMQ on the standby node, which would then recover the Mnesia tables, and start running.

So you are planning to produce to a broker and consume from the same broker. And once that 
broker fails, you will have both producers and consumers shift to another broker, which 
you will start on DRBD-cloned disk after the first broker failed?

I am currently leaning towards a cluster of RabbitMQ brokers. Producers will send messages 
to one broker; consumers will connect to *all* brokers in the cluster and will have 
application logic to identify duplicate messages (just in case). Producers will get an 
exception if their broker fails to properly receive a message, and will then reconnect and 
resend (tested with Qpid Python, planning to switch to py-amqplib shortly). Consumers might 
not get an exception about a broker failure soon enough, so for now I will have them 
consume from all brokers in the cluster.
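
To make the producer side concrete, here is a rough sketch of the "catch the exception, 
reconnect and resend" loop. It is not tied to any real AMQP client: BrokerUnavailable, 
publish_with_failover, the send callable and the broker names are all made up for 
illustration, and moving on to the *next* broker after a failure is my assumption (you 
could just as well retry the same one).

```python
class BrokerUnavailable(Exception):
    """Hypothetical stand-in for whatever error the AMQP client raises."""


def publish_with_failover(brokers, message, send, max_rounds=3):
    """Try each broker in turn until one accepts the message.

    brokers    -- list of broker identifiers (hypothetical names)
    send       -- callable(broker, message) that connects and publishes,
                  raising BrokerUnavailable on failure (an assumption)
    max_rounds -- how many passes over the broker list before giving up
    Returns the broker that accepted the message.
    """
    for _ in range(max_rounds):
        for broker in brokers:
            try:
                send(broker, message)
                return broker
            except BrokerUnavailable:
                # This broker did not take the message; try the next one.
                continue
    raise BrokerUnavailable("no broker accepted the message")
```

Note that a resend after an ambiguous failure can deliver the same message twice, which 
is one more reason for the duplicate check on the consumer side.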

I hope the latter approach will work better for me, since in my case I might have trouble 
convincing all consumers at the same time that one particular broker is not available 
(some of my consumers are over LAN, some are over WAN; if I lose WAN access, consumers on 
the WAN will think the broker is gone, while consumers on LAN will still be able to talk 
to it).
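
Since consumers reading from every broker may see the same message more than once, the 
duplicate check could look roughly like the sketch below. It assumes each message carries 
a unique ID set by the producer (for example in the AMQP message-id property); the 
Deduplicator class, consume_all and the (broker, message_id, body) tuples are hypothetical 
names for illustration, not part of any client library.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers the most recent message IDs so repeats can be skipped."""

    def __init__(self, max_ids=10000):
        self._seen = OrderedDict()
        self._max_ids = max_ids

    def is_new(self, message_id):
        """Return True the first time an ID is seen, False for repeats."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = True
        if len(self._seen) > self._max_ids:
            # Forget the oldest ID so memory use stays bounded.
            self._seen.popitem(last=False)
        return True


def consume_all(deliveries, handle, dedup):
    """Process deliveries merged from all brokers, skipping duplicates.

    deliveries -- iterable of (broker, message_id, body) tuples
    handle     -- callable(body) holding the application logic
    Returns the number of messages actually handled.
    """
    processed = 0
    for _broker, message_id, body in deliveries:
        if dedup.is_new(message_id):
            handle(body)
            processed += 1
    return processed
```

The bounded window is a trade-off: a duplicate arriving after its ID has been evicted 
would be processed again, so max_ids has to be sized for how late a duplicate can show up.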

Please let us know if you get to deploy RabbitMQ with DRBD-backed storage and how it works 
for you.

Thanks,
Dmitriy