Thank you for the long explanation, Matthew. You had already explained this to me once, and I have a rough understanding of how the memory allocation algorithm works at a high level. I don't think it explains what we are seeing, though. From what I could gather, the files written by Rabbit contain messages addressed to queues that are part of the "active" group. And even if Rabbit were writing every single message it receives, across all the queues, that would not result in this many writes: iostat reports around 6,000 writes/sec (with spikes to 17,000/sec), whereas our rate is more like 20 msg/sec across all queues.<div>
<br></div><div>We'll leave it running as is, but I'm worried that, as in the past, it will keep getting worse until the machine simply can't handle the load.</div><div><br></div><div>--</div><div>Raphael.<br><br>
<div class="gmail_quote">On Wed, Oct 26, 2011 at 4:01 PM, Matthew Sackman <span dir="ltr"><<a href="mailto:matthew@rabbitmq.com">matthew@rabbitmq.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Raphael,<br>
<br>
Rabbit uses a somewhat sophisticated mechanism to allocate RAM to<br>
queues. The allocation of RAM to queues is extremely difficult to do -<br>
what are the criteria that you should use to decide to prioritise one<br>
queue over another? Furthermore, if the same msg ends up in multiple<br>
queues then that msg will exist just once in RAM but be shared by all<br>
those queues. So it's actually nonsensical to ask how much memory a<br>
queue is using.<br>
<br>
The approach we adopt results in Rabbit telling all the queues on a node<br>
to achieve the same *duration* in RAM. I.e. if that duration is, for<br>
example, 10 seconds, then every queue should ensure that given their<br>
individual current ingress and egress rates, the number of messages they<br>
hold in RAM does not represent more than 10 seconds. Messages beyond<br>
that count will be written to disk.<br>
<br>
The effect of this strategy is that very fast queues find that an awful<br>
lot of messages make up that duration, and very slow queues find that<br>
very few messages make up that duration. Thus very slow queues can end<br>
up with a "target_ram_count" of 0 or small numbers, whilst fast queues<br>
have much higher "target_ram_counts".<br>
<br>
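To make the duration targeting concrete, here is a minimal sketch of the idea (in Python, not RabbitMQ's actual Erlang code; the function name and the exact rate-averaging are assumptions for illustration):

```python
# Hypothetical sketch of duration-based RAM targeting: each queue keeps
# only as many messages in RAM as its own rates can turn over within
# the node-wide target duration. Not RabbitMQ's actual implementation.

def target_ram_count(ingress_rate, egress_rate, target_duration):
    """Messages to keep in RAM: the average of the queue's ingress and
    egress rates (msg/sec) times the shared target duration (sec).
    Messages beyond this count get written to disk."""
    avg_rate = (ingress_rate + egress_rate) / 2.0
    return int(avg_rate * target_duration)

# A fast queue at 1000 msg/sec keeps many messages in RAM for a
# 10-second target duration...
fast = target_ram_count(1000, 1000, 10)   # 10000
# ...while a slow queue at 0.1 msg/sec keeps almost none, which is why
# slow queues end up with a target_ram_count of 0 or small numbers.
slow = target_ram_count(0.1, 0.1, 10)     # 1
```

The same duration thus translates into very different message counts per queue, which matches the behaviour Matthew describes.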
This is desirable because a slow queue is far better able to withstand<br>
having to write everything to disk and read it back than a fast queue<br>
is. As I have blogged about,<br>
( <a href="http://www.rabbitmq.com/blog/2011/09/24/sizing-your-rabbits/" target="_blank">http://www.rabbitmq.com/blog/2011/09/24/sizing-your-rabbits/</a> ), as a<br>
queue gets longer, its CPU-per-msg goes up. As it starts having to write<br>
to disk, the CPU-per-msg goes up even faster. Plus disks have much lower<br>
bandwidth than RAM. Thus a fast queue is likely to catastrophically<br>
consume CPU and disk bandwidth if it starts getting forced out to disk.<br>
But for a slow queue, this is absolutely fine.<br>
<br>
Thus in the case of a slow queue receiving a low rate of messages, what<br>
is the problem with Rabbit deciding to write those out to disk? It's<br>
doing it deliberately because it believes that queue can cope with being<br>
pushed to disk (which it seems to be able to - ingress and egress rates<br>
match), and in order to free up vital resources (RAM) for other queues<br>
which can not cope with being sent to disk.<br>
<br>
Rabbit works hard to avoid crowbars - sudden events where it decides to<br>
write out millions of messages. In order to avoid this on a queue that<br>
slowly grows and forces RAM usage to head towards the limit, Rabbit<br>
starts writing out to disk very early on so that the transition to<br>
fully-on-disk operation is as smooth and unnoticeable as possible. Thus<br>
even when Rabbit is some way away from its memory limit, it can still<br>
choose to start pushing msgs out to disk, so as to reduce the chances of<br>
hitting the memory limit and then realising that there are millions of<br>
msgs that need writing out, causing a massive burst to the disk which<br>
would disrupt performance substantially.<br>
<br>
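One way to picture that "no crowbar" behaviour is a target duration that shrinks smoothly as memory use approaches the limit, so paging starts early and stays gradual (again a hypothetical sketch; the linear scaling and function name are assumptions, not RabbitMQ's actual formula):

```python
# Hypothetical sketch: scale the shared RAM-duration target down as
# memory use nears the limit, instead of paging everything out only
# when the limit is actually hit.

def scaled_duration(base_duration, memory_used, memory_limit):
    """Shrink the duration target (sec) linearly with remaining
    memory headroom; at or beyond the limit it drops to zero."""
    headroom = max(0.0, 1.0 - memory_used / memory_limit)
    return base_duration * headroom

# Plenty of headroom: queues may keep close to the full duration in RAM.
scaled_duration(10.0, 250e6, 1e9)   # 7.5
# Close to the limit: the target shrinks, so queues start writing to
# disk well before the limit is reached, avoiding a sudden burst.
scaled_duration(10.0, 750e6, 1e9)   # 2.5
```

Under a scheme like this, disk writes ramp up continuously rather than arriving as one massive burst at the memory limit.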
I hope that helps explain a little more.<br>
<font color="#888888"><br>
Matthew<br>
</font><div><div></div><div class="h5">_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
</div></div></blockquote></div><br></div>