[rabbitmq-discuss] What can cause mnesia partitioning?

Mon Oct 15 11:44:46 BST 2012

Hi

On 10/12/2012 07:06 PM, tsuraan wrote:
> We had a pair of computers running a rabbit cluster, and somehow their
> mnesia databases diverged.  Each computer was running its own rabbit
> happily, but they both had cluster_status messages showing only
> themselves as the only "running" node, and they both had log messages
> to the effect of:
>
> Mnesia('rabbit@node-1'): ** ERROR ** mnesia_event got
> {inconsistent_database, starting_partitioned_network, 'rabbit@node-2'}

This indicates that a netsplit has occurred, a situation that mnesia
(and therefore clustered rabbit) does not handle well.
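
As a rough illustration of how one might watch for this condition, here
is a minimal Python sketch that scans log text for the
inconsistent_database event quoted above. The function name, the regular
expression, and the idea of feeding it the log contents as a string are
assumptions made for the example; this is not something rabbit itself
provides.

```python
import re

# Matches the Erlang term that mnesia prints; \s* tolerates the line
# break that appears between "got" and the tuple in the log.
PARTITION_EVENT = re.compile(
    r"\{inconsistent_database,\s*(?P<context>\w+),\s*'(?P<node>[^']+)'\s*\}"
)


def find_partition_events(log_text):
    """Return (context, node) pairs for each inconsistent_database event."""
    return [
        (m.group("context"), m.group("node"))
        for m in PARTITION_EVENT.finditer(log_text)
    ]


# Sample text taken from the log message quoted in this thread.
sample = """Mnesia('rabbit@node-1'): ** ERROR ** mnesia_event got
{inconsistent_database, starting_partitioned_network, 'rabbit@node-2'}"""

print(find_partition_events(sample))
```

Any non-empty result means the node has reported a partition and needs
manual attention.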

> I restarted both rabbit instances, and they both came up in an
> apparently functional single-node instance (cluster_status on each
> still showed the other node as a disc node, but not as a running
> node).  From my reading of http://www.rabbitmq.com/clustering.html, it
> doesn't seem like that should happen, unless each node was somehow
> convinced that it was the most up to date disc node.  Otherwise, one
> of the nodes should have waited 30 seconds for the other one, and then
> crashed if it couldn't be reached, right?  What sort of circumstances
> would cause both nodes to think they were the most up to date, and
> that they should continue running on their own?

A netsplit can cause this to happen. The 30-second wait applies when
all the nodes in a cluster have been shut down cleanly; it will not
help if the mnesia databases have diverged due to a netsplit.

> Along that line, is there any way to configure a rabbit node to only
> run if it can contact a strict majority of disc nodes?  I think that
> would make this sort of problem less likely to happen, assuming the
> problem stems from a network partition, or perhaps even from some
> period of time where each machine was running while the other was not.

The latter case should not present a problem: the mnesia databases
won't have diverged as such, so mnesia will synchronise them
whilst starting up the node that was temporarily offline. As for
configuring rabbit to 'only run if it can contact a strict majority of
disc nodes', there are other issues to consider besides the number of
nodes that can be contacted. Mnesia will not recover from
netsplits without intervention, so this problem is non-trivial
to fix. Nevertheless, we are actively looking for solutions to this
situation!
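
For reference, the strict-majority test itself is simple to state. The
following Python sketch is purely hypothetical (the function name and
the way node lists are obtained are assumptions, and rabbit has no such
setting), and it covers only the counting part, not the recovery issues
mentioned above.

```python
def has_strict_majority(reachable_disc_nodes, all_disc_nodes):
    """True if the reachable disc nodes (this node included) are a strict majority."""
    known = set(all_disc_nodes)
    reachable = set(reachable_disc_nodes) & known
    return 2 * len(reachable) > len(known)


cluster = ["rabbit@node-1", "rabbit@node-2"]

# During a partition, each node of a two-node cluster sees only itself.
print(has_strict_majority(["rabbit@node-1"], cluster))  # False
print(has_strict_majority(cluster, cluster))            # True
```

Note that with only two disc nodes, neither side of a partition holds a
strict majority, so under such a rule both nodes would stop rather than
diverge.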

Cheers,
Tim