[rabbitmq-discuss] Messages disappearing from queues when clustered brokers are restarted.

Simon MacMullen simon at rabbitmq.com
Wed Nov 23 16:59:05 GMT 2011


On 23/11/11 16:44, Andreas Lundberg wrote:
> Is there any way to prevent this kind of message loss, and why does it
> happen?

When you initially stop broker A, all the queues switch over to B as the 
master.

When you then start A again, all the queues get a new slave on A.

It's important to understand that these slaves initially start up 
without messages in them. They can't just pick up where they left off, 
as A might have been down for an arbitrary period of time and doesn't 
know what's happened.

These slaves are thus said to be "unsynchronised". You can check whether 
each slave is synchronised with rabbitmqctl or in the management plugin.

New messages published to the queues will go to all slaves, including the 
new ones. Once the last message that predates the new slave has been 
consumed and acked from the new master, the slave holds the same contents 
as the master and is considered synchronised again.

But I suspect you're restarting B before this happens. The 
unsynchronised slave on A then becomes the new master, and the older 
messages, which only B was holding, are lost.
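
To make the sequence concrete, here is a minimal sketch in Python. It is a 
toy model only, not RabbitMQ code or its API: the MirroredQueue class, the 
lost_messages function and the message names m1..m4 are all made up for 
illustration, and it assumes messages are unique and that a slave is 
"synchronised" exactly when its contents equal the master's.

```python
from collections import deque


class MirroredQueue:
    """Toy model of one mirrored queue. Not RabbitMQ's implementation."""

    def __init__(self, master):
        self.master = master
        # node name -> messages held by that node's copy of the queue
        self.copies = {master: deque()}

    def add_slave(self, node):
        # A new or restarted slave starts with no messages.
        self.copies[node] = deque()

    def publish(self, msg):
        # New messages go to the master and to every slave.
        for copy in self.copies.values():
            copy.append(msg)

    def consume(self):
        # Consume and ack from the master; a slave drops the message too,
        # but only if it ever received it.
        msg = self.copies[self.master].popleft()
        for node, copy in self.copies.items():
            if node != self.master and copy and copy[0] == msg:
                copy.popleft()
        return msg

    def is_synchronised(self, node):
        return self.copies[node] == self.copies[self.master]

    def stop(self, node):
        del self.copies[node]
        if node == self.master:
            if not self.copies:
                raise RuntimeError("no slave left to promote")
            # Promote the oldest remaining slave, synchronised or not.
            self.master = next(iter(self.copies))


def lost_messages(drain_old_first):
    """Replay the restart sequence and return the messages that vanish."""
    q = MirroredQueue("A")
    q.add_slave("B")
    published = ["m1", "m2", "m3"]
    for m in published:
        q.publish(m)

    q.stop("A")        # B takes over as master with m1..m3
    q.add_slave("A")   # A rejoins as an empty, unsynchronised slave
    q.publish("m4")    # reaches both B and A
    published.append("m4")

    consumed = []
    if drain_old_first:
        # Consume the three old messages so that A catches up.
        for _ in range(3):
            consumed.append(q.consume())

    q.stop("B")        # A is promoted, whatever state it is in
    remaining = list(q.copies[q.master])
    return [m for m in published
            if m not in consumed and m not in remaining]
```

In this model, lost_messages(False) returns ['m1', 'm2', 'm3'], because A 
is promoted while it only holds m4. lost_messages(True) returns [], 
because B is only stopped after the old messages have been consumed and A 
has become synchronised.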

In the future we'd like to add the ability for new slaves to eagerly 
synchronise, but it hasn't happened yet and it's quite a big task.

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, VMware

