Hi Matthew --<br><br>I'm not sure that's the problem. Things were running fine for a day or so, and then all of a sudden we found our Ruby and Java clients could no longer pull things out of the queue. I have a little multi-threaded performance tool which just writes out some messages and reads the same messages back on a single queue, and I have never had a problem with it before. When I pointed it at the machine that had high load, the one the clients weren't able to connect to, it was not able to receive any messages at all, even after sitting there for a minute. When I pointed it at the other machine, it worked fine. Interestingly, the 2nd machine, which had similarly high load (we round-robin queue tasks across the two machines), stopped working as well a few hours later. <br>
<br>Let me know what other info I can give you to help diagnose this.<br><br>thanks,<br>Scott <br><br><div class="gmail_quote">On Mon, Feb 15, 2010 at 12:10 PM, Matthew Sackman <span dir="ltr"><<a href="mailto:matthew@lshift.net">matthew@lshift.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div><div></div><div class="h5">On Mon, Feb 15, 2010 at 11:40:20AM -0800, scott w wrote:<br>
> We are also noticing that rabbitmq branch bug 21673 seems to hang after<br>
> placing a severe amount of load on it. I don't know erlang but is there some<br>
> way I can do a thread dump to send to you guys to help determine what the<br>
> problem might be?<br>
<br>
</div></div>Let me guess. You send an awful lot of messages into RabbitMQ, and then<br>
you start reading from the queue, and you find it takes some time before<br>
the first message pops out?<br>
<br>
If so, that's entirely expected. What you're seeing is the fact that<br>
writes, being async, allow a large backlog of pending disk activity to<br>
grow, bounded by available RAM. When a read occurs, except in some<br>
special circumstances, you have no choice but to let those writes hit<br>
disk before you can read them - you can't risk a read overtaking a<br>
write.<br>
<br>
However, that said, I'm now thinking that there are some situations in<br>
which allowing a read to have a higher priority than a write would be<br>
safe... I shall have a play. If you could confirm that this is the<br>
scenario you're seeing that'd be great. In general, you should be able<br>
to identify this situation by a high(ish) load average, but no CPU<br>
activity and your disks being burnt to a cinder. A 15kRPM SAS Fiber<br>
Channel 8-disk RAID 0 array /may/ help here :P<br>
<br>
Matthew<br>
<br>
_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">http://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
</blockquote></div><br>