simon at rabbitmq.com
Wed Mar 20 10:30:39 GMT 2013
On 19/03/13 23:59, Ben Hood wrote:
> I appreciate the development is still speculative, but I was wondering
> what the initial scope of partition autohealing is likely to be.
My current aim (aims subject to change) is to have three modes for
dealing with partitions. One mode is "manual intervention required" and
is essentially the same as 3.0.
The second mode is "pause the minority". In this mode, if any nodes look
at the cluster and see that only a minority of nodes are up (from their
point of view) they essentially hibernate (refusing to accept
connections or do any other work) until they see they are in a majority
again. Thus partitions can never form, at the cost of availability (i.e.
it's CP in CAP terms).
The idea is that "pause the minority" mode would be useful if you were
to have a cluster in something like EC2, spread across three or more
AZs. Your expected failure mode is for a single AZ to go completely
offline - in which case you don't really care whether the cluster nodes
within that AZ are running or not, so taking them down is an easy way to
ensure we have no partitions.
Obviously "pause the minority" mode is a bad idea with only two cluster
nodes (or AZs). "Pause the minority" mode is already complete and will
be in 3.1.
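
A minimal sketch of the check a node might make in "pause the minority"
mode, written in Python purely for illustration and not taken from the
RabbitMQ code. The function name and arguments are hypothetical, and it
assumes that seeing exactly half the cluster does not count as a
majority, which the description above leaves open.

```python
def should_pause(visible_nodes: int, cluster_size: int) -> bool:
    """Return True if this node should hibernate.

    visible_nodes: how many cluster members this node can currently
                   see as up, counting itself (hypothetical input).
    cluster_size:  total number of nodes configured in the cluster.
    """
    if not 1 <= visible_nodes <= cluster_size:
        raise ValueError("visible_nodes must be between 1 and cluster_size")
    # Assumption: a strict majority is required to keep running, so a
    # node seeing exactly half of the cluster pauses as well.
    return 2 * visible_nodes <= cluster_size


# Three nodes, one per AZ, and one AZ is cut off from the other two.
isolated_node_pauses = should_pause(visible_nodes=1, cluster_size=3)
connected_node_pauses = should_pause(visible_nodes=2, cluster_size=3)

# Two nodes that lose sight of each other: under the tie assumption
# above, neither side has a majority.
two_node_split_pauses = should_pause(visible_nodes=1, cluster_size=2)
```

With three AZs the isolated node pauses and the other two carry on. With
two nodes, each side sees one of two: under the tie rule assumed here
both would pause, and under the opposite rule neither would, so the
partition would form anyway. Either outcome shows why two nodes are a
poor fit for this mode.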
The third mode (currently under development; hopefully in 3.1) is
autohealing. In this mode partitions can still form, but once the
cluster becomes connected again, a winning partition is chosen and all
nodes outside it are restarted to heal the partition - in other words,
mimicking what a sysadmin would do with 3.0. The interesting question is
how to choose the winning partition; my current thinking is some heuristic
based on the number of nodes in the partition, the number of open AMQP
connections at the moment we decide to heal, and possibly the number of
changes that have been made in Mnesia (if we can get hold of that