<div dir="ltr">Ditto on what Michael said about disk IO. Rabbit has been rock solid for us in prod, with no partitions for more than a year and a half (we DID have some operating system issues due to a kernel bug that hits at around 208 days of uptime). Whenever I've seen problems with Rabbit, it's been because the disk gets overloaded, and Rabbit doesn't seem to handle that very well. I remember back in the 2.8 days running load tests with HiPE and outright crashing Rabbit after about 5 minutes at a constant 90% disk utilization. Which reminds me, I need to run some load tests again and see how 3.3 handles things. <div>

<br></div><div>Jason</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, May 30, 2014 at 5:24 AM, Laing, Michael <span dir="ltr">&lt;<a href="mailto:michael.laing@nytimes.com" target="_blank">michael.laing@nytimes.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Experiences vary - we cluster across AWS zones in multiple regions and have not had a partition in production in over a year.<div>

<br></div><div>I have experimented with this, and the basic rules are: always make sure you have spare compute capacity by over-provisioning your clustered instances relative to their load; never go into an IO wait state (avoid intense disk usage brought on by swapping or needless message persistence).</div>
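<div><br></div><div>One rough way to watch for the IO wait condition described above is to sample the aggregate "cpu" line of /proc/stat twice and compare the counters. The following is only a minimal sketch, assuming a Linux host; the function names are made up for illustration and are not part of RabbitMQ or any library, and what percentage counts as "too high" is left to the reader.</div>
<pre>
```python
"""Estimate the share of CPU time spent in iowait from two /proc/stat samples (Linux only)."""
import time


def parse_cpu_line(line: str) -> list[int]:
    """Return the numeric counters from the aggregate 'cpu ...' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent waiting on IO between the two samples."""
    first, second = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Counter order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so only the
    # first eight counters go into the total.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat, which holds the aggregate counters."""
    with open(path) as stat_file:
        return stat_file.readline()


def sample_iowait(interval_seconds: float = 5.0) -> float:
    """Take two samples interval_seconds apart and return the iowait percentage."""
    before = read_cpu_line()
    time.sleep(interval_seconds)
    after = read_cpu_line()
    return iowait_percent(before, after)
```
</pre>
<div>Calling sample_iowait() periodically on each clustered node and alerting when the value stays elevated would give early warning before the broker starts to struggle; tools such as iostat or vmstat report the same figure if a script is not wanted.</div>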


<div><br></div><div>Also, avoid the us-east-1 region as it has the most usage, oldest hardware, highest latencies, and highest incidence of failures.</div><div><br></div><div>ml</div></div><div class="gmail_extra"><br><br>


<div class="gmail_quote"><div><div class="h5">On Thu, May 29, 2014 at 10:24 PM, Fei Yao <span dir="ltr">&lt;<a href="mailto:mail2fei@gmail.com" target="_blank">mail2fei@gmail.com</a>&gt;</span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div><div class="h5">

<div dir="ltr">Richard,<br>I'm not an expert on GCE, but if the instances are in the same zone, it should not be considered a WAN. We have some experience with AWS: a Multi-AZ setup should be considered a WAN, and we experienced 3 network partitions last year.<br>


<br>Best Regards<div><div><br><br>On Tuesday, May 6, 2014 9:46:39 AM UTC-4, Richard Tier wrote:<blockquote class="gmail_quote" style="margin:0;margin-left:0.8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">


<div>The <a href="https://www.rabbitmq.com/clustering.html" target="_blank">docs</a> on clustering state that "clustering does not tolerate network partitions well", and advise against clustering over a WAN.</div>

<div>

<br></div><div>I'm using Google Compute Engine <span style="font-size:13px">instances </span><span style="font-size:13px">- all of which are in the same zone.</span></div><div><span style="font-size:13px"><br></span></div>


<div>Should this setup be considered a WAN?</div></div></blockquote></div></div></div><br></div></div><div class="">_______________________________________________<br>

rabbitmq-discuss mailing list<br>

<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com" target="_blank">rabbitmq-discuss@lists.rabbitmq.com</a><br>

<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>

<br></div></blockquote></div><br></div>


<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr">Jason McIntosh<br><a href="https://github.com/jasonmcintosh/" target="_blank">https://github.com/jasonmcintosh/</a><br>573-424-7612</div>

</div>