[rabbitmq-discuss] Priorities support in RabbitMQ
Michal Chruszcz
mchruszcz at gmail.com
Tue Sep 1 13:32:39 BST 2009
On Aug 31, 2009, at 2:19 PM, Holger Hoffstätte wrote:
> Michal Chruszcz wrote:
>> control order of message processing. Say I have a distributed
>> asynchronous data processing service, which uses RabbitMQ for
>> queueing tasks, and would like to prioritize them depending on data
>> volume (tiny packages are processed first). Priorities seemed to be
>> the perfect solution, but unfortunately I later realized they're not
>> supported yet.
>
> It is a common misconception that priorities alone can be used to do
> this "properly" - it will only work well if you are accidentally lucky
> with your ingress patterns or really don't care about a total lack of
> predictable processing latency for larger messages. Since Rabbit does
> not page to disk yet, preferring small messages might even make it run
> out of memory *sooner* as more and more large messages are held back.
In general you're absolutely right, but this does not apply to my
case, since the messages themselves don't contain the data to be
processed. They are only "pointers" to the data, and every pointer has
the same size, so postponing the processing of messages that point to
larger amounts of data doesn't increase memory usage. On the contrary,
it could even lower it, since total throughput would increase.
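
To make the "pointer" idea concrete, here is a minimal sketch in plain
Python. It is not RabbitMQ or celery code, only an illustration of the
ordering I am after. The class names and the store:// locations are made
up for the example.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class TaskPointer:
    """A small message that refers to the data instead of carrying it."""
    data_volume: int                      # size in bytes of the referenced data
    seq: int                              # arrival order, breaks ties FIFO
    location: str = field(compare=False)  # hypothetical reference to the data


class SmallestFirstQueue:
    """In-memory queue that hands out the smallest data volume first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def put(self, location, data_volume):
        pointer = TaskPointer(data_volume, next(self._counter), location)
        heapq.heappush(self._heap, pointer)

    def get(self):
        return heapq.heappop(self._heap).location

    def __len__(self):
        return len(self._heap)


queue = SmallestFirstQueue()
queue.put("store://big", 5_000_000)
queue.put("store://tiny", 1_200)
queue.put("store://medium", 80_000)

# Drain the queue: tiny packages come out first, regardless of arrival order.
order = [queue.get() for _ in range(len(queue))]
```

Each queued entry is the same few fields whatever the data volume is, so
holding back the large ones costs no more memory than holding back the
small ones.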
> Do not
> underestimate the effectiveness of randomized workloads. :)
I am not. However, a well-thought-out deterministic algorithm is usually
more effective. :-)
> Explicitly modeling this with multiple queues and multiple consumers
> is much more likely to result in better (both predictable and
> controllable) scalability and a well-balanced throughput for both
> small and large messages. Spin up multiple consumers for small
> messages (keeping those cores busy and even hopefully reducing
> per-message latency), have one low-priority consumer chew on larger
> tasks in the background or even on another machine. It also gives you
> the option of having different durability/persistence options for
> different priorities.
Again, this is a good idea in general, but a little awkward in my
situation. My consumers are celery (http://celeryproject.org/) workers,
whose role is performing background computation, and the idea behind
celery is automatic distribution of tasks between nodes. While it is
probably possible to define custom routing of tasks, it goes against
that architecture.
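
For comparison, the multi-queue arrangement suggested above could be
sketched roughly like this, again in plain Python rather than with the
RabbitMQ or celery APIs. The SMALL_LIMIT threshold, the queue names and
the store:// locations are assumptions made up for the example.

```python
from collections import deque

SMALL_LIMIT = 100_000  # hypothetical cut-off in bytes, not from this thread

# One FIFO queue per size class; each would get its own set of consumers.
queues = {"small": deque(), "large": deque()}


def route(location, data_volume):
    """Put a pointer message on the queue matching its data volume."""
    name = "small" if data_volume <= SMALL_LIMIT else "large"
    queues[name].append(location)
    return name


def next_task(queue_name):
    """Return the oldest message on one queue, or None if it is empty."""
    pending = queues[queue_name]
    return pending.popleft() if pending else None


route("store://big", 5_000_000)
route("store://tiny", 1_200)
route("store://medium", 80_000)

# A consumer bound to "small" never sees the large task, and vice versa.
small_first = next_task("small")
large_first = next_task("large")
```

The publisher has to know the threshold and the queue names here, which
is the kind of explicit routing I would rather not hand-maintain on top
of celery's own task distribution.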
Best regards,
Michal Chruszcz
mchruszcz at gmail.com
mobile: +48 607 620 771
phone: +48 22 849 30 26