[rabbitmq-discuss] AWS clustering

Matthias Radestock matthias at rabbitmq.com
Wed Sep 5 10:01:47 BST 2012


On 05/09/12 09:46, Francesco Mazzoli wrote:
> At Tue, 4 Sep 2012 20:01:38 -0700 (PDT),
> Glade wrote:
>> For a supposedly "just works" kind of service, that is just not good enough. I
>> can't have my ops people rolling out of bed to take action every time there's
>> a minor network glitch.
>
> Rabbit clustering is meant to be run on local networks and is not tolerant to
> "network glitches".  If you expect those, then don't use it.

There is in fact some built-in tolerance to glitches. Specifically, TCP
should tolerate network glitches unless it has been misconfigured.

You may also want to increase net_ticktime, a configuration parameter
of Erlang's kernel application. See
http://www.erlang.org/doc/man/kernel_app.html. And make sure you are
running the most recent Erlang release.
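
For illustration, here is a minimal sketch of raising net_ticktime
through the broker's Erlang-terms configuration file. The value 120 is
only an assumed example, not a recommendation (the kernel default is 60
seconds), and the file location (commonly rabbitmq.config) depends on
your installation:

```erlang
%% Sketch only: 120 is an illustrative value, not a recommendation.
%% net_ticktime is given in seconds; the kernel default is 60.
[
  {kernel, [
    {net_ticktime, 120}
  ]}
].
```

Per the kernel documentation linked above, an unresponsive node is
detected after between 0.75 and 1.25 times net_ticktime, so with this
value that would be roughly 90 to 150 seconds. Every node in the
cluster should use the same value.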

If, however, the glitch is severe enough to exceed those tolerances,
you'll end up with a proper network split. As Francesco notes, Erlang's
mnesia distributed db (on which rabbit's clustering is based) cannot
cope with that, so recovery requires manual intervention.

Regards,

Matthias.

