[rabbitmq-discuss] Priorities support in RabbitMQ

Tue Sep 1 13:32:39 BST 2009

On Aug 31, 2009, at 2:19 PM, Holger Hoffstätte wrote:

> Michal Chruszcz wrote:
>> control order of message processing. Say I have a distributed
>> asynchronous data processing service, which uses RabbitMQ for queueing
>> tasks, and would like to prioritize them depending on data volume
>> (tiny packages are processed first). Priorities seemed to be the perfect
>> solution, but unfortunately I later realized they are not supported yet.
>
> It is a common misconception that priorities alone can be used to do this
> "properly" - it will only work well if you are accidentally lucky with
> your ingress patterns or really don't care about a total lack of
> predictable processing latency for larger messages. Since Rabbit does not
> page to disk yet, preferring small messages might even make it run out of
> memory *sooner* as more and more large messages are held back.

In general you're absolutely right, but this does not apply to my
case, since the messages themselves don't contain the data to be
processed. They are only "pointers" to the data, and every pointer has
the same constant size, so postponing the messages that point to larger
amounts of data doesn't increase memory usage. On the contrary, it could
even lower it, since total throughput would increase.
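
To illustrate the argument, here is a minimal sketch, not a model of
RabbitMQ itself. It assumes a single worker, a backlog that is already
fully queued, and made-up processing costs; the function name and the
numbers are hypothetical. It compares processing in arrival order with
processing the smallest tasks first.

```python
import heapq


def mean_completion_time(costs, smallest_first):
    """Average time at which a task finishes, with one worker and all
    tasks already queued at time zero (a simplifying assumption)."""
    if smallest_first:
        # Pop tasks in ascending order of processing cost.
        heap = list(costs)
        heapq.heapify(heap)
        order = [heapq.heappop(heap) for _ in range(len(heap))]
    else:
        # Plain FIFO: keep arrival order.
        order = list(costs)

    clock = 0
    total = 0
    for cost in order:
        clock += cost   # the worker finishes this task at time `clock`
        total += clock
    return total / len(order)


# Hypothetical processing costs (arbitrary time units), in arrival order.
costs = [50, 1, 30, 2, 1, 40]
fifo = mean_completion_time(costs, smallest_first=False)
small_first = mean_completion_time(costs, smallest_first=True)
print("FIFO:", fifo, "smallest first:", small_first)
```

In this simplified model the time needed to drain the whole backlog is
the same either way; what drops is the average time a message spends
waiting, and with it the average number of constant-size pointer
messages sitting in the queue at any moment.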

> Do not
> underestimate the effectiveness of randomized workloads. :)

I am not. However, a well-thought-out deterministic algorithm is
usually more effective. :-)

> Explicitly modeling this with multiple queues and multiple consumers is
> much more likely to result in better (both predictable and controllable)
> scalability and a well-balanced throughput for both small and large
> messages. Spin up multiple consumers for small messages (keeping those
> cores busy and even hopefully reducing per-message latency), have one
> low-priority consumer chew on larger tasks in the background or even on
> another machine. It also gives you the option of having different
> durability/persistence options for different priorities.
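
The multi-queue layout described above can be sketched roughly as
follows. This is an in-memory illustration only: the deques stand in
for two separate broker queues, and the queue names, the SMALL_LIMIT
threshold and the helper functions are all hypothetical, not part of
RabbitMQ or celery.

```python
from collections import deque

# Hypothetical cut-off between "small" and "large" tasks (data volume).
SMALL_LIMIT = 10


def make_queues():
    """Two in-memory stand-ins for two separate broker queues."""
    return {"small": deque(), "large": deque()}


def route(queues, task_id, volume):
    """Publisher side: choose a queue based on the task's data volume."""
    name = "small" if volume <= SMALL_LIMIT else "large"
    queues[name].append((task_id, volume))
    return name


def consume(queues, queue_name):
    """Consumer side: take the oldest task from one queue, or None."""
    q = queues[queue_name]
    return q.popleft() if q else None


queues = make_queues()
route(queues, "thumbnail", 3)
route(queues, "full-reindex", 500)
print(consume(queues, "small"))   # handled by one of several fast consumers
print(consume(queues, "large"))   # handled by the single background consumer
```

In a real deployment each queue would have its own consumers, for
example several on the small queue and one on the large queue, so that
large tasks cannot delay small ones.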

Again, this is a good idea in general, but a little bit awkward in my
situation. My consumers are celery (http://celeryproject.org/) workers,
whose role is to perform background computation, and the idea behind
celery is automatic distribution of tasks between nodes. While it is
probably possible to define custom routing of tasks, doing so goes
against that architecture.

Best regards,
Michal Chruszcz
mchruszcz at gmail.com
mobile: +48 607 620 771
phone: +48 22 849 30 26