<div dir="ltr">Hi,<div><br></div><div>Assume we have two hosts, red and green, each running 10 consumers on the same queue. Inside `rabbit_amqqueue_process`, these consumers are placed in a standard Erlang `queue` (from the `queue` module), where they wait for jobs to arrive. If all 20 consumers are busy all the time, there is no problem at all. But if there are more consumers than messages in the queue, the surplus consumers sit idle and wait.</div>
<div><br></div><div>There are a couple of reasons to have idle workers, the most important being that you want to absorb sudden message spikes. Ideally, the consumers from the two hosts would be interleaved in the queue:</div>
<div><br></div><div>RGRGRGRG,...</div><div><br></div><div>But in practice, since it is a queue, this may not be the case. We could have something along the lines of</div><div><br></div><div>RRRRRRRGGGGGGG,...</div><div><br>
</div><div>which means that if requests arrive slowly, they will be processed only by the Red host for a while and then only by the Green host for a while. If the hosts differ in nature, it is very likely that clusters like this will form in the queue over time.</div>
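<div><br></div><div>To make the effect concrete, here is a small Python model of round-robin dispatch over a consumer queue. It is only a sketch, not the broker's Erlang code. It assumes that each message goes to the consumer at the head of the queue and that messages arrive slowly enough for that consumer to finish and rejoin the tail before the next message shows up. The function name `dispatch` and the one-letter host labels are made up for the illustration.</div>
<pre>
```python
from collections import deque


def dispatch(consumers, n_messages):
    """Hand n_messages to idle consumers in round-robin order.

    Assumption: messages arrive slowly, so each consumer finishes its
    message and rejoins the tail of the queue before the next one arrives.
    """
    idle = deque(consumers)
    served_by = []
    for _ in range(n_messages):
        consumer = idle.popleft()   # head of the queue gets the message
        served_by.append(consumer)
        idle.append(consumer)       # already done, so back to the tail
    return "".join(served_by)


interleaved = "RG" * 4   # RGRGRGRG
clustered = "RRRRGGGG"

print(dispatch(interleaved, 8))  # RGRGRGRG
print(dispatch(clustered, 8))    # RRRRGGGG
```
</pre>
<div>In this model the clustered ordering sends the first four messages to Red alone, while the interleaved ordering alternates between the hosts from the first message on.</div>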
<div><br></div><div>A way to alleviate this is to check for the following conditions whenever we have "run" the queue:</div><div><br></div><div>1. There are no more messages (queue is empty)</div><div>2. There are active consumers waiting (active_consumers is not empty)<br clear="all">
<div><br></div><div style>When both conditions hold, we pick a random consumer in the queue and move it to the front. Over time, this "shuffles" the queue into a random order. It costs nothing on the critical path, since we only do it when the message queue is empty and there are excess workers. We also do very little work overall, unless the queue empties often, in which case this scheme gives a fully random distribution over the consumers.</div>
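<div style><br></div><div style>The step could look roughly like the following Python sketch. It models the idle consumers as a `deque` rather than an Erlang `queue`, and the names `shuffle_step`, `idle_consumers` and `message_count` are hypothetical, not taken from `rabbit_amqqueue_process`.</div>
<pre>
```python
import random
from collections import deque


def shuffle_step(idle_consumers, message_count, rng=random):
    """Move one randomly chosen idle consumer to the front of the queue.

    Only acts when the message queue is empty and at least one consumer
    is idle, so it never runs while there is work to deliver.
    """
    if message_count != 0 or len(idle_consumers) == 0:
        return idle_consumers
    index = rng.randrange(len(idle_consumers))
    chosen = idle_consumers[index]
    del idle_consumers[index]          # take it out of its current slot
    idle_consumers.appendleft(chosen)  # and put it at the head
    return idle_consumers


# Usage: start from a clustered queue and apply the step repeatedly,
# as would happen each time the message queue runs empty.
rng = random.Random(1)
consumers = deque("RRRRGGGG")
for _ in range(50):
    shuffle_step(consumers, message_count=0, rng=rng)
print("".join(consumers))  # some reordering of the same eight consumers
```
</pre>
<div style>Removing an element from the middle is linear in the number of idle consumers in this model, and I would expect something similar for an Erlang `queue`, so the "very little work" claim assumes the list of idle consumers is short.</div>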
<div style><br></div><div style>The background for the proposal is that round-robin distribution of messages often tends toward bad behaviour over time. By adding a bit of randomness to the process, we automatically alleviate a number of problems caused by determinism and get a better distribution of messages over consumers. One could also imagine different distribution schemes, but those would be more expensive in practice than this proposal, which should only have a cost when the queue is not under heavy load.</div>
<div style><br></div><div style>* Did I miss anything?</div><div style>* Is this a good or bad idea? And why?</div><div style>* Do we break any rules w.r.t. AMQP by implementing this?</div><div style>* Is priority on the queue going to be harder to implement? (I don't think so, but...)</div>
<div style> </div>-- <br>J.
</div></div>