[rabbitmq-discuss] High publish rates, mirrored queues, high watermark...
Simon MacMullen
simon at rabbitmq.com
Thu Jan 12 17:23:26 GMT 2012
On 12/01/12 16:41, Jim Myhrberg wrote:
> Hi, I was hoping you guys would be able to answer some quick questions
> about an issue we're having.
Hi Jim.
> We've recently started using mirrored queues across two RabbitMQ
> instances on two servers, and it's working really well except when
> publish rates stay really high for an extended period of time. At that
> point RabbitMQ starts eating a lot of memory on the master node until
> it hits our high watermark limit of 18.9GB, at which point things
> basically stop working.
>
> My theory of what is happening is that when RabbitMQ can't process
> incoming publish requests fast enough, it caches the raw TCP packets in
> memory until it can deal with them and direct them to the queue(s) in
> question.
This is approximately true. In fact, RabbitMQ can be viewed as a
pipeline; when publishing over AMQP it looks like this:

(OS TCP buffer) -> [reader] -> [channel] -> [queue] -> etc...
Each of the square-bracketed things is an Erlang process. The reader
(aka the connection) disassembles packets into AMQP methods (most
frequently basic.publish). The channel applies those methods (in the
case of basic.publish, by making routing decisions, checking security,
etc.). And then there's the queue.
The Erlang processes communicate by message passing, so each process
has a mailbox containing messages that have been sent on by the
previous stage but not yet handled.
The OS manages the size of the TCP buffer, and in general it won't hold
*too* huge a number of messages, but the process mailboxes can grow
arbitrarily large when you publish "too fast" for the queue to keep up.
And that's where most of your memory is going, I'll bet.
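
To make that concrete, here's a rough stand-alone model in plain Java
(not RabbitMQ code - the message size and sleep time are made up): a
producer feeds an unbounded in-memory queue faster than a consumer can
drain it, so the backlog (and with it memory use) grows without limit.
That's essentially what happens to the channel and queue mailboxes
inside the broker.

import java.util.concurrent.LinkedBlockingQueue;

// Toy model of an unbounded process mailbox: the producer outruns the
// consumer, so the backlog (and memory use) keeps growing.
public class UnboundedMailboxDemo {
    public static void main(String[] args) throws InterruptedException {
        final LinkedBlockingQueue<byte[]> mailbox =
                new LinkedBlockingQueue<byte[]>();   // no capacity bound

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        mailbox.take();   // the "queue" stage doing its work
                        Thread.sleep(1);  // deliberately slower than the producer
                    }
                } catch (InterruptedException e) {
                    // exit quietly
                }
            }
        });
        consumer.setDaemon(true);
        consumer.start();

        // The "publisher" sends as fast as it can and is never pushed back on.
        for (long i = 0; ; i++) {
            mailbox.put(new byte[1024]);
            if (i % 100000 == 0) {
                System.out.println("backlog: " + mailbox.size() + " messages");
            }
        }
    }
}

Run that and watch the backlog figure (and the JVM's memory) climb
steadily.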
The theory is that more and more memory gets used up by the process
mailboxes until eventually the memory alarm goes off, which causes all
the readers to block while Rabbit sorts itself out. However, by this
stage there can be a huge amount of work still to do (ironically, the
more memory you have the worse off you are), and everything grinds to a
halt for potentially a very long time.
The good news is that we're working (right now, by lucky coincidence) on
a system of internal flow control that will prevent the process
mailboxes from becoming too big. This means that a publisher which is
publishing "too fast" for a queue to handle will get pushed back on much
more rapidly (within a second or so rather than after memory fills up).
This feature is likely to be in the next release, but if you can't wait
until then you can improve matters considerably by using confirms and
only allowing each publisher to have (e.g.) 1000 unconfirmed messages.
You won't get a confirm back until the message has hit the queue, so you
bound the number of in-flight messages.
Arguably you should be doing that anyway - if your messages are
important enough that they need to go to mirrored queues they are
probably important enough that the publisher needs confirmation that
they've been accepted by the broker.
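
For what it's worth, here's a sketch of that approach with the Java
client. The host and queue name are placeholders, and batching 1000
publishes between calls to waitForConfirms() is just one simple way to
cap the number of unconfirmed messages; a ConfirmListener would let you
do the same thing without stalling at batch boundaries.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.MessageProperties;

// Sketch: never let more than ~1000 publishes be unconfirmed at once.
public class BoundedPublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");            // placeholder
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();

        channel.confirmSelect();                 // turn on publisher confirms

        byte[] body = "hello".getBytes("UTF-8");
        int unconfirmed = 0;

        while (true) {
            channel.basicPublish("", "my-queue", // placeholder routing
                    MessageProperties.PERSISTENT_BASIC, body);

            // Once 1000 messages are in flight, block until the broker
            // has confirmed the whole batch before publishing any more.
            if (++unconfirmed >= 1000) {
                channel.waitForConfirms();
                unconfirmed = 0;
            }
        }
    }
}

The point is simply that the publisher blocks long before the broker's
memory fills up, rather than the other way round.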
> And I'm assuming that using mirrored queues adds overhead to dealing
> with a publish request, as it needs to be synced to the other node(s).
> Am I right?
Yes, very much so. Mirrored queues are noticeably slower than
non-mirrored ones due to the extra work involved. They're also newer
and not as heavily optimised.
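
(In case it's useful to anyone following along: with the current
releases a queue is mirrored by declaring it with the "x-ha-policy"
argument, roughly like this from the Java client - the host and queue
name are placeholders.)

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.util.HashMap;
import java.util.Map;

public class DeclareMirroredQueue {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");                 // placeholder
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();

        // "x-ha-policy" = "all" asks the broker to mirror the queue
        // across every node in the cluster.
        Map<String, Object> queueArgs = new HashMap<String, Object>();
        queueArgs.put("x-ha-policy", "all");

        channel.queueDeclare("my-queue",   // placeholder name
                true,                      // durable
                false,                     // exclusive
                false,                     // auto-delete
                queueArgs);

        channel.close();
        conn.close();
    }
}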
> My theory is based on the fact that we could deal with much higher
> publish rates without any problems before we switched to mirrored
> queues. It's also based on the fact that at a few points when memory
> usage has been excessive, we've seen queues that were completely empty
> according to the management plugin while our workers were still
> processing 1000+ messages/sec for a good hour; all the while the
> management plugin showed no incoming messages and 1000+ msg/s of
> get/acks.
Yes, that's exactly consistent with messages being backed up in process
mailboxes.
Cheers, Simon
--
Simon MacMullen
RabbitMQ, VMware