[rabbitmq-discuss] Problem concurrency issue with rabbitmq (branch bug21673)
scottblanc at gmail.com
Mon Feb 15 20:36:12 GMT 2010
Hi Matthew --
I'm not sure that's the problem. Things were running fine for a day or so
and then all of a sudden we found our Ruby and Java clients could no longer
pull things out of the queue. I have a little multi-threaded performance
tool I use which just writes out some messages and reads back the same
messages on a single queue and I have never had a problem with it before.
When I pointed it at the machine that had high load and which the clients
weren't able to connect to, it was not able to receive any messages at all
even after sitting there for a minute. When I pointed it at the other
machine it was working fine. Interestingly, the 2nd machine, which had
similarly high load (we round-robin queue tasks across two machines),
stopped working after a few hours as well.
Let me know what other info I can give you to help diagnose this.
On Mon, Feb 15, 2010 at 12:10 PM, Matthew Sackman <matthew at lshift.net>wrote:
> On Mon, Feb 15, 2010 at 11:40:20AM -0800, scott w wrote:
> > We are also noticing that rabbitmq branch bug21673 seems to hang after
> > placing a severe amount of load on it. I don't know Erlang but is there
> > a way I can do a thread dump to send to you guys to help determine what the
> > problem might be?
> Let me guess. You send an awful lot of messages into RabbitMQ, and then
> you start reading from the queue, and you find it takes some time before
> the first message pops out?
> If so, that's entirely expected. What you're seeing is the fact that
> writes, being async, allow a large backlog of pending disk activity to
> grow, bounded by available RAM. When a read occurs, except in some
> special circumstances, you have no choice but to let those writes hit
> disk before you can read them - you can't risk a read overtaking a
> write.
> However, that said, I'm now thinking that there are some situations in
> which allowing a read to have a higher priority than a write would be
> safe... I shall have a play. If you could confirm that this is the
> scenario you're seeing that'd be great. In general, you should be able
> to identify this situation by a high(ish) load average, but no CPU
> activity and your disks being burnt to a cinder. A 15kRPM SAS Fiber
> Channel 8-disk RAID 0 array /may/ help here :P
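The backlog effect described in the quoted reply can be illustrated with a toy model. This is only a sketch in Python, not RabbitMQ code: the names ToyDiskQueue, flushes_per_tick and first_read_latency are hypothetical, and the "disk" is just a second in-memory queue. It assumes writes are acknowledged immediately, flushed strictly in order, and that a read has to wait until every pending write has been flushed.

```python
from collections import deque


class ToyDiskQueue:
    """Toy model (not RabbitMQ internals): async writes, ordered flushes."""

    def __init__(self, flushes_per_tick=1):
        self.pending = deque()   # writes accepted but not yet on "disk"
        self.on_disk = deque()   # writes that have been flushed
        self.flushes_per_tick = flushes_per_tick

    def publish(self, msg):
        # Async write: returns at once, so the backlog can grow freely.
        self.pending.append(msg)

    def tick(self):
        # One unit of disk time: flush a bounded number of pending writes.
        for _ in range(min(self.flushes_per_tick, len(self.pending))):
            self.on_disk.append(self.pending.popleft())

    def get(self):
        # A read may not overtake a write, so it waits until the whole
        # backlog has reached disk before returning the oldest message.
        ticks_waited = 0
        while self.pending:
            self.tick()
            ticks_waited += 1
        msg = self.on_disk.popleft() if self.on_disk else None
        return msg, ticks_waited


def first_read_latency(backlog_size, flushes_per_tick=1):
    """Publish backlog_size messages, then return (first message, ticks waited)."""
    q = ToyDiskQueue(flushes_per_tick)
    for i in range(backlog_size):
        q.publish(i)
    return q.get()
```

In this model the wait before the first message comes out grows linearly with the backlog and shrinks with disk throughput, which matches a delayed first read but not, by itself, a consumer that never receives anything.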