Active/Active failover and lost messages

Konstantin Kalin konstantin.kalin at gmail.com
Thu Nov 3 21:19:51 GMT 2011


I'm playing with RabbitMQ in Active/Active mode (mirrored queues). I
think it's a very good product. Currently I have an issue and I could
not find the reason for it. I would like to understand whether I am
using RabbitMQ incorrectly.
My setup is:
 a) 3 cluster nodes (2 disc, 1 ram)
 b) 50 publishers/consumers. All publishers and consumers are
distributed across the nodes.
 c) Message publishing rate is about 15 messages per second per
 d) Publishers work in "publisher confirms" mode
 e) Consumers work with "autoack=false"
 f) A publisher or client knows about all nodes in the cluster and
does failover to another node if there is an issue with current
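
The failover in (f) can be pictured roughly as follows. This is only a
minimal sketch, not the actual client code: the class NodeFailover, the
method connectToAny, and the node names are hypothetical, and the
"connector" stands in for whatever really opens an AMQP connection.

```java
import java.util.List;
import java.util.function.Function;

public class NodeFailover {
    /** Tries each node in order and returns the first connection that opens. */
    public static <C> C connectToAny(List<String> nodes, Function<String, C> connector) {
        RuntimeException last = null;
        for (String node : nodes) {
            try {
                return connector.apply(node);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next node
            }
        }
        throw new IllegalStateException("no cluster node reachable", last);
    }

    public static void main(String[] args) {
        // Hypothetical node names; the first one is simulated as stopped.
        List<String> nodes = List.of("rabbit1", "rabbit2", "rabbit3");
        String conn = connectToAny(nodes, node -> {
            if (node.equals("rabbit1")) {
                throw new RuntimeException("node down");
            }
            return "connected to " + node;
        });
        System.out.println(conn);
    }
}
```

Note that reconnecting alone does not re-create consumers; after a
failover each consumer still has to subscribe again on the new channel.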

If I stop a node, some consumers don't get all messages (even if the
consumer is connected to another node). About 2-3 messages are
I did several tests and found a correlation. Only consumers that get a
"ConsumerCancelledException" lose messages. Other consumers don't
lose any messages even if they are connected to the node I stop.
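
One way to pin down which messages go missing in such a test is to have
each publisher stamp its messages with an increasing sequence number and
let the consumer record gaps. The sketch below is an assumption for
illustration, not part of the setup above: SequenceGapTracker,
onDelivery, and the publisher id "pub-1" are hypothetical names, and it
assumes in-order delivery per publisher (a requeued message arriving
late would be reported as a gap first).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SequenceGapTracker {
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final List<String> gaps = new ArrayList<>();

    /** Records one delivery; notes a gap if sequence numbers were skipped. */
    public void onDelivery(String publisherId, long seq) {
        Long prev = lastSeen.get(publisherId);
        if (prev != null && seq > prev + 1) {
            gaps.add(publisherId + ": missing " + (prev + 1) + ".." + (seq - 1));
        }
        // Redeliveries (seq <= prev) do not move the high-water mark.
        if (prev == null || seq > prev) {
            lastSeen.put(publisherId, seq);
        }
    }

    public List<String> gaps() {
        return gaps;
    }

    public static void main(String[] args) {
        SequenceGapTracker tracker = new SequenceGapTracker();
        for (long seq : new long[] {1, 2, 3, 6, 7}) {
            tracker.onDelivery("pub-1", seq);
        }
        System.out.println(tracker.gaps()); // [pub-1: missing 4..5]
    }
}
```
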
Could you please advise what I need to check to find a reason of the

Thank you,

