<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi Elias,<br>

    <br>

    As I'm fairly new around here, I'll try and share what I've learned

    so far and allow the more experienced folks to chip in and fill in

    the details (or correct me if I go astray).<br>

    <br>

    On 07/11/2012 12:11 AM, Elias Levy wrote:

    <blockquote

cite="mid:CAFDmHdMA-7QCJLf=R3TjSorKf3R1EwnrEJvyj9RPsLPagCw4-w@mail.gmail.com"

      type="cite">I am curious as to what the behavior of HA queues

      during a network split is. &nbsp;

      <div><br>

      </div>

      <div>The documentation states that when a master fails a slave will

        be promoted to master, but it's silent under what conditions a

        slave will consider a master to have failed. &nbsp;Is there some

        timeout after which slaves will consider a master to have

        failed? &nbsp;If so, what is the time value?</div>

      <div><br>

      </div>

    </blockquote>

    <br>

    This situation is not handled using a timeout. HA queues are based

    on a technology called Guaranteed Multicast (aka GM), which was

    developed independently by and for RabbitMQ. It provides an atomic

    broadcast capability similar to the work described by Levy et al

    (<cite>biblion.epfl.ch/EPFL/theses/2008/3999/EPFL_TH3999.pdf</cite>),

    though, as the documentation notes, GM was not derived from that

    work.<br>

    <br>

    You can take a look at the GM source code here:

    <a class="moz-txt-link-freetext" href="http://hg.rabbitmq.com/rabbitmq-server/file/default/src/gm.erl">http://hg.rabbitmq.com/rabbitmq-server/file/default/src/gm.erl</a><br>

    <br>

    A GM group forms a ring, in which members are connected to their

    immediate neighbours (in both directions) only. If one of these

    connections breaks then the death of the member is propagated around

    the ring and everything 'reshuffles' to compensate for this. The

    deaths are noticed because the Erlang processes involved are

    monitored (see the links under [monitors] at the bottom for

    technical details), so the guarantees and relative timings involved

    are those of Erlang's process monitoring.<br>

    <br>
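    To make the 'reshuffle' a bit more concrete, here is a minimal
    sketch in Python (not Erlang, and not taken from gm.erl). The Ring
    class and its method names are made up for illustration, and I'm
    assuming that news of a death starts at the dead member's right-hand
    neighbour and travels around the ring from there.<br>
    <pre>
```python
class Ring:
    """Toy model of a GM-style ring; not RabbitMQ's actual implementation."""

    def __init__(self, members):
        # Ring order: the last member wraps around to the first.
        self.members = list(members)

    def neighbours(self, member):
        """Return the (left, right) neighbours of a member."""
        i = self.members.index(member)
        n = len(self.members)
        return self.members[(i - 1) % n], self.members[(i + 1) % n]

    def member_died(self, dead):
        """Remove a dead member and return survivors in notification order.

        Assumption of this sketch: the right-hand neighbour notices first
        and the news is then passed around the ring from there.
        """
        i = self.members.index(dead)
        self.members.remove(dead)
        if not self.members:
            return []
        start = i % len(self.members)
        return self.members[start:] + self.members[:start]


ring = Ring(["a", "b", "c", "d"])
print(ring.neighbours("c"))    # ('b', 'd')
print(ring.member_died("c"))   # ['d', 'a', 'b']
print(ring.neighbours("d"))    # ('b', 'a')
```
    </pre>
    With the ring a, b, c, d, killing c leaves d with b and a as its
    neighbours, so the ring has closed around the gap.<br>
    <br>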

    In actual fact, mirror (i.e., HA) queues are implemented 'on top of'

    GM and also rely on Rabbit's clustering infrastructure, so

    additional (Erlang) process and node monitoring is in place at the

    level above GM which will also *notice* if a node goes down.<br>

    <br>

    <blockquote

cite="mid:CAFDmHdMA-7QCJLf=R3TjSorKf3R1EwnrEJvyj9RPsLPagCw4-w@mail.gmail.com"

      type="cite">

      <div>Assuming that such timeout exists, if there is a network

        split you may end up with two clusters, each of which now has a

        master. &nbsp;Each may also have publishers and consumers that

        continue to work happily against the split cluster.</div>

      <div><br>

      </div>

    </blockquote>

    <br>

    Now we're talking about two different things. Rabbit clustering is

    a separate mechanism from mirror (HA) queues, though mirror queues

    depend on it. If a netsplit occurs then the surviving nodes which

    are still connected to the extant master *should* continue happily

    on. I'm not so sure what will happen to the nodes in the other

    'half' of the split, so I'll put my hand up and ask someone better

    versed in this to fill in the blanks.<br>

    <br>

    <blockquote

cite="mid:CAFDmHdMA-7QCJLf=R3TjSorKf3R1EwnrEJvyj9RPsLPagCw4-w@mail.gmail.com"

      type="cite">

      <div>What happens when the network split is repaired? &nbsp;Will the

        clusters join? &nbsp;If so, what will happen to the HA queue? &nbsp;Will

        one of the existing masters be demoted to slave? &nbsp;If so, what

        happens to its queue of messages that originated within its

        split cluster? &nbsp;Are they lost?</div>

      <div><br>

      </div>

    </blockquote>

    <br>

    AFAIK it is possible for Mnesia to heal itself after a netsplit, and

    therefore getting nodes to rejoin a cluster might work without

    intervention, possibly depending on what has happened independently

    on the two 'halves' of the split in the intervening time period.

    What I would not expect to happen (though I could be wrong here!) is

    for two distinct GM rings to join up and become one, promoting a new

    master or demoting an existing one, the latter behaviour being

    undefined (i.e., not implemented) AFAICT.<br>

    <br>

    When a node rejoins a cluster, Mnesia needs to reconcile the

    differences, and I would expect to see Mnesia fail when trying to

    rejoin the cluster if the (Erlang) process ID for the master was

    different between the two nodes.<br>

    <br>
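    As a rough illustration of why I'd expect the rejoin to be a
    problem, the following Python sketch compares what each 'half' has
    recorded as the master for each queue. It is a toy model using plain
    dictionaries, not the Mnesia API; the function name, queue names and
    process identifiers are all hypothetical.<br>
    <pre>
```python
def conflicting_masters(side_a, side_b):
    """Return queue names whose recorded master differs between two halves.

    Each argument maps a queue name to the identifier of the process that
    this half believes is the master. Toy model only, not the Mnesia API.
    """
    shared = set(side_a).intersection(side_b)
    return sorted(q for q in shared if side_a[q] != side_b[q])


# Hypothetical state after a split: one half kept the old master for
# "orders" while the other half promoted a new one.
half_one = {"orders": "pid-1@node1", "audit": "pid-7@node1"}
half_two = {"orders": "pid-4@node3", "audit": "pid-7@node1"}

print(conflicting_masters(half_one, half_two))  # ['orders']
```
    </pre>
    Here only the 'orders' queue is reported, since the two halves
    disagree about its master; something would still have to decide
    which record wins.<br>
    <br>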

    <blockquote

cite="mid:CAFDmHdMA-7QCJLf=R3TjSorKf3R1EwnrEJvyj9RPsLPagCw4-w@mail.gmail.com"

      type="cite">

      <div>I suppose a lot of this depends on the underlying Mnesia DB.

        &nbsp;I&nbsp;realize&nbsp;RMQ is a CA system out of the CAP theorem, but it's not at

        all clear what occurs in the face of a network partition.</div>

      <br>

    </blockquote>

    <br>

    Yes indeed - Mnesia does not play nicely in this kind of scenario.

    There are some efforts underway to make it *easier* to deal with

    netsplits (for example

    <a class="moz-txt-link-freetext" href="https://github.com/uwiger/otp/commit/3f70f3def4e33828da4237b07cbee9f73121c661">https://github.com/uwiger/otp/commit/3f70f3def4e33828da4237b07cbee9f73121c661</a>

    and <a class="moz-txt-link-freetext" href="https://github.com/uwiger/unsplit">https://github.com/uwiger/unsplit</a>) but these are not mainstream

    or ready for production use just yet.<br>

    <br>

    And even if some mechanism were available, we would have the dual

    problems of deciding which Mnesia record is the correct one (the

    system of record) *and* being able to join two GM rings back

    together, which sounds infeasibly hard to me.<br>

    <br>

    [monitors]<br>

    <a class="moz-txt-link-freetext" href="http://www.erlang.org/doc/reference_manual/processes.html#id82613">http://www.erlang.org/doc/reference_manual/processes.html#id82613</a><br>

    <a class="moz-txt-link-freetext" href="http://www.erlang.org/doc/man/erlang.html#monitor-2">http://www.erlang.org/doc/man/erlang.html#monitor-2</a><br>

    <a class="moz-txt-link-freetext" href="http://www.erlang.org/doc/man/net_kernel.html">http://www.erlang.org/doc/man/net_kernel.html</a><br>

    <br>

  </body>

</html>