Thank you for the long explanation, Matthew. You had already explained this to me once, and I have a rough understanding of how the memory allocation algorithm works at a high level. I don't think it explains what we are seeing, though. From what I could gather, the files written by Rabbit contain messages addressed to queues that are part of the "active" group. And even if Rabbit were writing every single message it receives, across all the queues, that would not result in this many writes: iostat reports around 6,000 writes/sec (with spikes to 17,000/sec), whereas our rate is more like 20 msg/sec across all queues.<div>
<br></div><div>We'll leave it running as is, but I'm worried that, as in the past, it will keep getting worse until the machine simply can't handle the load.</div><div><br></div><div>--</div><div>Raphael.<br><br>
<div class="gmail_quote">On Wed, Oct 26, 2011 at 4:01 PM, Matthew Sackman <span dir="ltr"><<a href="mailto:matthew@rabbitmq.com">matthew@rabbitmq.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Raphael,<br>
<br>
Rabbit uses a somewhat sophisticated mechanism to allocate RAM to<br>
queues. The allocation of RAM to queues is extremely difficult to do -<br>
what are the criteria that you should use to decide to prioritise one<br>
queue over another? Furthermore, if the same msg ends up in multiple<br>
queues then that msg will exist just once in RAM but be shared by all<br>
those queues. So it's actually nonsensical to ask how much memory a<br>
queue is using.<br>
<br>
The approach we adopt results in Rabbit telling all the queues on a node<br>
to achieve the same *duration* in RAM. I.e. if that duration is, for<br>
example, 10 seconds, then every queue should ensure that given their<br>
individual current ingress and egress rates, the number of messages they<br>
hold in RAM does not represent more than 10 seconds. Messages beyond<br>
that count will be written to disk.<br>
<br>
The effect of this strategy is that very fast queues find that an awful<br>
lot of messages make up that duration, and very slow queues find that<br>
very few messages make up that duration. Thus very slow queues can end<br>
up with a "target_ram_count" of 0 or small numbers, whilst fast queues<br>
have much higher "target_ram_counts".<br>
<br>
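To make the duration targeting concrete, here is a minimal sketch of the idea (in Python, not RabbitMQ's actual Erlang code; the function name and the exact rate-averaging are assumptions for illustration):

```python
# Hypothetical sketch of duration-based RAM targeting: each queue keeps
# only as many messages in RAM as its own rates can turn over within
# the node-wide target duration. Not RabbitMQ's actual implementation.

def target_ram_count(ingress_rate, egress_rate, target_duration):
    """Messages to keep in RAM: the average of the queue's ingress and
    egress rates (msg/sec) times the shared target duration (sec).
    Messages beyond this count get written to disk."""
    avg_rate = (ingress_rate + egress_rate) / 2.0
    return int(avg_rate * target_duration)

# A fast queue at 1000 msg/sec keeps many messages in RAM for a
# 10-second target duration...
fast = target_ram_count(1000, 1000, 10)   # 10000
# ...while a slow queue at 0.1 msg/sec keeps almost none, which is why
# slow queues end up with a target_ram_count of 0 or small numbers.
slow = target_ram_count(0.1, 0.1, 10)     # 1
```

The same duration thus translates into very different message counts per queue, which matches the behaviour Matthew describes.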
This is desirable because a slow queue is far better able to withstand<br>
having to write everything to disk and read it back than a fast queue<br>
is. As I have blogged about,<br>
( <a href="http://www.rabbitmq.com/blog/2011/09/24/sizing-your-rabbits/" target="_blank">http://www.rabbitmq.com/blog/2011/09/24/sizing-your-rabbits/</a> ), as a<br>
queue gets longer, its CPU-per-msg goes up. As it starts having to write<br>
to disk, the CPU-per-msg goes up even faster. Plus disks have much lower<br>
bandwidth than RAM. Thus a fast queue is likely to catastrophically<br>
consume CPU and disk bandwidth if it starts getting forced out to disk.<br>
But for a slow queue, this is absolutely fine.<br>
<br>
Thus in the case of a slow queue receiving a low rate of messages, what<br>
is the problem with Rabbit deciding to write those out to disk? It's<br>
doing it deliberately because it believes that queue can cope with being<br>
pushed to disk (which it seems to be able to - ingress and egress rates<br>
match), and in order to free up vital resources (RAM) for other queues<br>
which can not cope with being sent to disk.<br>
<br>
Rabbit works hard to avoid crowbars - sudden events where it decides to<br>
write out millions of messages. In order to avoid this on a queue that<br>
slowly grows and forces RAM usage to head towards the limit, Rabbit<br>
starts writing out to disk very early on so that the transition to<br>
fully-on-disk operation is as smooth and unnoticeable as possible. Thus<br>
even when Rabbit is some way away from its memory limit, it can still<br>
choose to start pushing msgs out to disk, so as to reduce the chances of<br>
hitting the memory limit and then realising that there are millions of<br>
msgs that need writing out, causing a massive burst to the disk which<br>
would disrupt performance substantially.<br>
<br>
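One way to picture that "no crowbar" behaviour is a target duration that shrinks smoothly as memory use approaches the limit, so paging starts early and stays gradual (again a hypothetical sketch; the linear scaling and function name are assumptions, not RabbitMQ's actual formula):

```python
# Hypothetical sketch: scale the shared RAM-duration target down as
# memory use nears the limit, instead of paging everything out only
# when the limit is actually hit.

def scaled_duration(base_duration, memory_used, memory_limit):
    """Shrink the duration target (sec) linearly with remaining
    memory headroom; at or beyond the limit it drops to zero."""
    headroom = max(0.0, 1.0 - memory_used / memory_limit)
    return base_duration * headroom

# Plenty of headroom: queues may keep close to the full duration in RAM.
scaled_duration(10.0, 250e6, 1e9)   # 7.5
# Close to the limit: the target shrinks, so queues start writing to
# disk well before the limit is reached, avoiding a sudden burst.
scaled_duration(10.0, 750e6, 1e9)   # 2.5
```

Under a scheme like this, disk writes ramp up continuously rather than arriving as one massive burst at the memory limit.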
I hope that helps explain a little more.<br>
<font color="#888888"><br>
Matthew<br>
</font><div><div></div><div class="h5">_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
</div></div></blockquote></div><br></div>