[rabbitmq-discuss] Active/Active failover and lost messages

Konstantin Kalin konstantin.kalin at gmail.com
Thu Nov 3 21:19:51 GMT 2011


Hello,

I'm experimenting with RabbitMQ in Active/Active mode (mirrored
queues). I think it's a very good product. I have run into an issue
whose cause I could not find, and I would like to understand whether
I am using RabbitMQ incorrectly.
My setup is:
 a) 3 cluster nodes (2 disc, 1 ram)
 b) 50 publishers/consumers. All publishers and consumers are
distributed across the nodes.
 c) The publishing rate is about 15 messages per second per
publisher.
 d) Publishers use "Publisher confirms" mode
 e) Consumers run with "autoack=false"
 f) Each publisher or client knows about all nodes in the cluster and
fails over to another node if there is an issue with its current
connection.
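
To make point f) concrete, the failover logic can be sketched roughly
as below. This is only an illustration, not the actual client code:
the class NodeFailover, its method names and the "connect" callback
are hypothetical, and the callback stands in for whatever call really
opens a connection to a node.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: tries each known cluster node in turn until one accepts a connection. */
final class NodeFailover<C> {
    private final List<String> nodes;          // node addresses; the format is illustrative
    private final Function<String, C> connect; // opens a connection or throws RuntimeException
    private int current = 0;                   // index of the node tried first

    NodeFailover(List<String> nodes, Function<String, C> connect) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("no nodes configured");
        }
        this.nodes = new ArrayList<>(nodes);
        this.connect = connect;
    }

    /** Returns a connection to the first reachable node, starting from the current one. */
    C connectToAny() {
        RuntimeException last = null;
        for (int i = 0; i < nodes.size(); i++) {
            int idx = (current + i) % nodes.size();
            try {
                C connection = connect.apply(nodes.get(idx));
                current = idx; // remember which node we ended up on
                return connection;
            } catch (RuntimeException e) {
                last = e; // this node is unreachable, try the next one
            }
        }
        throw new IllegalStateException("all nodes unreachable", last);
    }

    /** Call when the current connection fails, so the next attempt starts at the next node. */
    void markCurrentFailed() {
        current = (current + 1) % nodes.size();
    }

    String currentNode() {
        return nodes.get(current);
    }
}
```

On a connection error the client would call markCurrentFailed() and
then connectToAny() again before re-creating its channels and
consumers.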

If I stop a node, some consumers don't get all messages (even
consumers connected to a different node). About 2-3 messages are
lost.
I ran several tests and found a correlation: only consumers that get
a "ConsumerCancelledException" lose messages. Other consumers don't
lose any messages, even if they are connected to the node I stop.
Could you please advise what I need to check to find the cause of
the issue?
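
One way to pin down exactly which messages go missing would be a
small gap tracker on the consumer side, sketched below. It assumes
that each publisher stamps its messages with an increasing sequence
number, which is not part of the setup described above; GapTracker
and its methods are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical consumer-side helper: records sequence numbers and reports gaps. */
final class GapTracker {
    // Sequence numbers seen so far, grouped by (assumed) publisher id.
    private final Map<String, TreeSet<Long>> seen = new HashMap<>();

    /** Records a delivery; returns false if this number was already seen (e.g. a redelivery). */
    boolean record(String publisherId, long seq) {
        return seen.computeIfAbsent(publisherId, k -> new TreeSet<>()).add(seq);
    }

    /** Sequence numbers between the lowest and highest seen that never arrived. */
    TreeSet<Long> missing(String publisherId) {
        TreeSet<Long> got = seen.get(publisherId);
        TreeSet<Long> gaps = new TreeSet<>();
        if (got == null || got.isEmpty()) {
            return gaps;
        }
        for (long s = got.first(); s <= got.last(); s++) {
            if (!got.contains(s)) {
                gaps.add(s);
            }
        }
        return gaps;
    }
}
```

Comparing the reported gaps with the time a consumer was cancelled
would show whether the lost messages are the ones that were in flight
at that moment.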

Thank you,
Konstantin.

