<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Jacques,<br>
<br>
Have you posted details about this to the mailing list previously? I
didn't see anything specific from you in the last week or so.<br>
<br>
Would you be able to provide logs and/or further details about your
setup? Obviously we're keen to track down any bugs that cause
operational issues and resolve them ASAP.<br>
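<br>
In the meantime, a snapshot of each node's state at the moment the
problem occurs would be really useful. If you have the management
plugin enabled, something like the Python sketch below captures the
key numbers. It assumes the default "guest" credentials and the 3.0.x
management port of 15672 (55672 on 2.8.x), and the exact field names
can vary between releases, so treat it as a starting point:<br>
<br>
<pre># Minimal sketch: capture basic node health from the RabbitMQ
# management plugin's HTTP API. Host, port and credentials are
# placeholders for your environment.
import base64
import json
import urllib.request

HOST = "localhost"   # placeholder broker host
PORT = 15672         # 15672 on 3.0.x, 55672 on 2.8.x

def api_get(path):
    req = urllib.request.Request("http://%s:%d/api/%s" % (HOST, PORT, path))
    token = base64.b64encode(b"guest:guest").decode("ascii")
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)

for node in api_get("nodes"):
    print(node["name"],
          "running=%s" % node.get("running"),
          "mem_used=%s" % node.get("mem_used"),
          "mem_limit=%s" % node.get("mem_limit"),
          "mem_alarm=%s" % node.get("mem_alarm"))</pre>
<br>
The broker logs (rabbit@&lt;host&gt;.log and the matching -sasl.log)
covering the time of the outage would help too.<br>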
<br>
Cheers,<br>
Tim<br>
<br>
On 04/19/2013 04:06 PM, Jacques Doubell wrote:
<blockquote
cite="mid:398f5f6d-22c0-49ec-989b-6cfae1416cda@googlegroups.com"
type="cite">We have also recently upgraded to 3.0.4 and have since
then had 2 outages. In the one case the service was running but
non functional. The logs didn't have errors, but at a certain
point just stopped receiving new connections. We had to restart
the service and all was well until about a week later when there
were a lot of heaped up messages server side but clients could not
connect to the queue anymore. (server actively refused connection
message from the client side). We will be downgrading to 2.8.x in
the mean time.<br>
<br>
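A probe along the following lines can help tell a TCP-level
"actively refused" error apart from a broker that still accepts TCP
but never completes the AMQP handshake. It's a minimal sketch using
the Python pika client, with placeholder hostname and credentials:<br>
<br>
<pre># Minimal sketch: separate "connection refused" (no listener) from
# an AMQP handshake that fails or hangs (listener up, broker wedged).
import socket
import pika

HOST = "rabbit.example.com"  # placeholder broker host

# 1. Raw TCP probe: an actively refused connection fails right here.
try:
    socket.create_connection((HOST, 5672), timeout=5).close()
    print("TCP connect OK -- something is listening on 5672")
except OSError as exc:
    print("TCP connect failed:", exc)

# 2. Full AMQP handshake: a wedged broker can accept the TCP
#    connection yet never get through this step.
try:
    params = pika.ConnectionParameters(
        host=HOST, socket_timeout=5,
        credentials=pika.PlainCredentials("guest", "guest"))
    pika.BlockingConnection(params).close()
    print("AMQP handshake OK")
except pika.exceptions.AMQPConnectionError as exc:
    print("AMQP handshake failed:", exc)</pre>
<br>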
On Friday, April 12, 2013 8:36:22 PM UTC+2, Matt Wise wrote:
<blockquote class="gmail_quote" style="margin: 0;margin-left:
0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">We've been
running RabbitMQ 2.8.x in production in Amazon for about 16
months now with very few issues. Last week we ran into an
issue where our 2.8.5 cluster nodes hit their high memory
watermark and stopped processing jobs, effectively taking down
our entire Celery task queue. We decided to upgrade to 3.0.4
(which had been running in staging for a few weeks, as a single
instance, without issue) and at the same time beef up the size
and redundancy of our farm to three m1.large machines.
<div><br>
</div>
<div>Old Farm:</div>
<div> server1: m1.small, 2.8.5, us-west-1c</div>
<div> server2: m1.small, 2.8.5, us-west-1c</div>
<div><br>
</div>
<div>New Farm:</div>
<div> server1: m1.large, 3.0.4, us-west-1a</div>
<div> server2: m1.large, 3.0.4, us-west-1c</div>
<div> server3: m1.large, 3.0.4, us-west-1c</div>
<div><br>
</div>
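<div>(For context on that memory limit: RabbitMQ raises its memory
alarm once a node crosses vm_memory_high_watermark, 0.4 of installed
RAM by default, and then blocks publishers until usage drops. A
back-of-envelope sketch, using Amazon's published RAM figures for
these instance types, shows how much extra headroom the m1.larges
buy:)</div>
<div><br>
</div>
<pre># Back-of-envelope: where RabbitMQ's default memory alarm fires.
# RAM figures are Amazon's published specs for these instance types.
WATERMARK = 0.4  # RabbitMQ's default vm_memory_high_watermark

for instance, ram_gb in [("m1.small", 1.7), ("m1.large", 7.5)]:
    print("%-9s alarm fires at ~%.2f GB" % (instance, WATERMARK * ram_gb))

# m1.small  alarm fires at ~0.68 GB
# m1.large  alarm fires at ~3.00 GB</pre>
<div><br>
</div>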
<div>Since creating the new farm, though, we've had three
outages. In the first two we got a network partition split,
and effectively all three systems decided to run their own
queues independently of each other. That was the first time
we'd ever seen this failure. In the most recent one, two
machines split off and the third node's rabbitmq service
became entirely unresponsive.</div>
<div><br>
</div>
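<div>(When a split like this happens, each side keeps serving its
own view of the cluster, so the quickest way to see it is to query
every node directly and compare answers. Below is a minimal sketch
against the management API; it assumes the plugin is enabled, the
default "guest" credentials, and placeholder hostnames. Releases
after 3.0 also report a per-node "partitions" field once a split has
been detected:)</div>
<div><br>
</div>
<pre># Minimal sketch: ask each node which cluster members it believes
# are running. During a partition the answers disagree.
import base64
import json
import urllib.request

def cluster_view(host, port=15672):
    req = urllib.request.Request("http://%s:%d/api/nodes" % (host, port))
    token = base64.b64encode(b"guest:guest").decode("ascii")
    req.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return [(n["name"], n.get("running")) for n in json.load(resp)]

for host in ["server1", "server2", "server3"]:  # placeholder hostnames
    print(host, "sees:", cluster_view(host))</pre>
<div><br>
</div>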
<div>For sanity's sake, at this point we've backed down to the
following configuration:</div>
<div><br>
</div>
<div>New-New Farm:</div>
<div> server1: m1.large, 2.8.5, us-west-1c</div>
<div> server2: m1.large, 2.8.5, us-west-1a</div>
<div><br>
</div>
<div>Until recently I had felt extremely comfortable with
RabbitMQ's clustering technology and reliability... now, not
so much. Has anyone else seen similar behavior? Is it simply
because we're now running cross-zone in Amazon, is it the move
to three servers, or is it the 3.0.x upgrade?</div>
<div><br>
</div>
<div>--Matt</div>
</blockquote>
<br>
<br>
<pre wrap="">_______________________________________________
rabbitmq-discuss mailing list
<a class="moz-txt-link-abbreviated" href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a>
<a class="moz-txt-link-freetext" href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a>
</pre>
</blockquote>
<br>
</body>
</html>