<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Oct 25, 2013 at 6:16 AM, Simon MacMullen <span dir="ltr">&lt;<a href="mailto:simon@rabbitmq.com" target="_blank">simon@rabbitmq.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
OK, I've replicated crashes with the script rebalance_cluster.sh. These will receive prompt attention!<br></blockquote><div><br></div><div>Great! This is the only issue I've found that I'm really worried about, since we do want the ability to rebalance queues across our cluster when adjusting cluster topology, so we can properly distribute workload amongst the nodes. Everything else has pretty simple workarounds.<br>
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Incidentally a strong contributing factor appears to be backgrounding "rabbitmqctl set_policy" and thus running lots of copies simultaneously. So a workaround is to only run one copy at once.</blockquote><br></div>
<div class="gmail_quote">Yeah, I did that intentionally in my test script so it'd be easier to reproduce the issue. I was actually doing this sequentially when I initially wrote my script to rebalance queues across the cluster, and still encountered the issue; it just took a lot more cluster operations before it happened. Presumably if you fix the parallel case (by making the API handler block when the number of running API requests hits an upper bound, or something?), it'll automatically solve any issues with the sequential case.<br>
<br>Graeme<br><br></div></div></div>