[rabbitmq-discuss] Complete cluster crash (2.7.1)

Tue May 22 11:44:51 BST 2012

Hi Christian,

On 21/05/12 14:01, Christian Bick wrote:
> we had a complete crash of our 3-Node 2.7.1 cluster some days ago. What
> I saw on the web, this issue might be related to
> http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2012-April/019765.html
> . As such it may have been fixed in 2.8.x which we are going to migrate
> to. Nevertheless, I would like to ask for clarification on that if possible.

The errors that you found do not appear identical to the ones reported
in that message from April. It is possible that you are experiencing a
consequence of a bug that has already been fixed in v2.8.2. It would
help if you could upgrade and repeat the procedure that triggered the
bug. We will be in a much better position to diagnose errors reported
against the latest version of the broker.

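You should be aware that if you restart all nodes more quickly than it
takes a message to move through the queue, you risk message loss. A
restarted node rejoins as an unsynchronised slave: it starts empty and
holds none of the messages that were queued before it rejoined. The
section on unsynchronised slaves has more information: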

http://www.rabbitmq.com/ha.html#unsynchronised-slaves
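
As a rough illustration only, one way to check before restarting the
next node is to compare each queue's slaves with its synchronised
slaves, as reported by
"rabbitmqctl list_queues name slave_pids synchronised_slave_pids".
The sketch below assumes your broker version supports those info items
and that the output is tab-separated with pids printed as <...>; the
sample listing and the function name unsynchronised_queues are made up
for the example, so check them against what your broker really prints.

```python
import re

# Assumption: pids appear in the listing as <...> tokens.
PID = re.compile(r"<[^<>]+>")


def unsynchronised_queues(listing):
    """Map each queue name to the set of its slave pids that are not
    yet listed as synchronised. Queues with no lagging slave are omitted."""
    result = {}
    for line in listing.splitlines():
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # skip banner lines such as "Listing queues ..."
        name, slaves, synced = fields
        lagging = set(PID.findall(slaves)) - set(PID.findall(synced))
        if lagging:
            result[name] = lagging
    return result


# Hypothetical listing; real pids and layout may differ on your broker.
SAMPLE = (
    "Listing queues ...\n"
    "orders\t[<rabbit@b.1.200.0>, <rabbit@c.1.300.0>]\t[<rabbit@b.1.200.0>]\n"
    "audit\t[<rabbit@b.1.201.0>]\t[<rabbit@b.1.201.0>]\n"
    "...done.\n"
)

lagging = unsynchronised_queues(SAMPLE)
```

If the result is non-empty, at least one slave is still catching up, and
restarting the node that hosts the master for that queue risks losing
the messages that slave has not yet seen.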

-Emile