<div class="gmail_quote"><div>Hi Simon,</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"></div>
I think Matthias meant to disable management altogether. Elsewhere you mentioned that the management overview page was very slow; this is a known issue with large numbers of queues / connections (which we are working on) so don't take that as evidence of the cluster grinding to a halt. And management does have some per-object overheads of its own; it would be good to eliminate them.<br>
I'll try disabling management altogether and see what happens; I'll post my
results. Although once it's disabled, it gets annoyingly difficult to see
what's going on.
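For reference, this is roughly what I'm planning to run on each node (a
minimal sketch; whether the node also needs a restart for the change to take
effect depends on the RabbitMQ version):

    rabbitmq-plugins disable rabbitmq_management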
> So what goes wrong when you start this many connections? (and queues - I
> assume you still have one queue per connection?) Do you run out of memory,
> does CPU reach 100%? "rabbitmqctl status" can tell you more about where
> memory is going.
Both CPU and RAM seem to be increasing at a stable pace and are nowhere near
the limits. At some point one node decides to lock up (i.e. it stops accepting
new connections), then another, then another... It looks very much like
running out of file handles / sockets, but those are nowhere near their limits
either (30k used out of a 100k limit for open sockets), and there's nothing in
the logs. I'll do some more tests and try to play around with logging to get
more information out of this; maybe AWS is imposing some hidden limits.
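For the record, this is how I'm checking those numbers on each node (a quick
sketch; the exact output format of "rabbitmqctl status" varies between
versions, and the pgrep call assumes a single Erlang VM per box):

    # memory breakdown plus file descriptor / socket usage as the broker sees it
    rabbitmqctl status

    # the open-files limit actually applied to the Erlang VM process
    cat /proc/$(pgrep beam | head -1)/limits | grep "open files"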
Regards,
Roman