[rabbitmq-discuss] Single vs. Multiple Queue Topology Performance

Ilya Volodarsky ilya at segment.io
Fri Nov 4 19:05:24 GMT 2011


Hey Jerry,

Thanks for your very detailed answer. Regarding availability and
durability, I definitely plan to use the active/active mirrored queue
approach on the queue. We hope to keep the broker in a "good"
steady state, in that we'll have enough consumers to empty the queue
at roughly the same rate as messages come in. The queue is mostly
there so that we can upgrade the consumers without losing any tasks,
and so that Rabbit can buffer messages for a few seconds if there is
a strong spike in traffic.
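
To put rough numbers on the buffering idea, here is a minimal sketch
in Python. It does not talk to Rabbit at all; it only tracks queue
depth second by second. The function name and all of the rates are
made up for illustration and are not measurements from our system.

```python
def backlog_over_time(arrivals_per_sec, consume_rate_per_sec):
    """Return the queue depth at the end of each one-second tick.

    arrivals_per_sec: messages published during each tick.
    consume_rate_per_sec: most messages all consumers together can
        drain in one tick.
    """
    depth = 0
    depths = []
    for arrived in arrivals_per_sec:
        # Messages arrive, then consumers drain up to their capacity.
        depth = max(0, depth + arrived - consume_rate_per_sec)
        depths.append(depth)
    return depths


# Hypothetical load: steady 1000 msg/s, a 3-second spike to 1500 msg/s,
# and consumers that can drain 1100 msg/s in total.
arrivals = [1000, 1000, 1500, 1500, 1500, 1000, 1000, 1000]
print(backlog_over_time(arrivals, 1100))
# -> [0, 0, 400, 800, 1200, 1100, 1000, 900]
```

With only 100 msg/s of headroom in this example, the 1200-message
backlog left by the spike takes about 12 seconds to drain, so the
amount of spare consumer capacity decides how long "a few seconds"
really is.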

I have a follow up question if you don't mind:

You said "it shouldn't be a bottleneck unless your production rate
really overwhelms what a single queue process can handle". If there is
one process per queue, aren't we losing the advantage of multiple
cores helping to dequeue messages to consumers? Wouldn't that be the
advantage of having multiple queues?
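
To make the multiple-queue alternative concrete, here is a minimal
sketch in Python of spreading tasks over several queues by API key.
It only computes a queue name and does not use any Rabbit client
library. The function name, the "tasks" prefix, and the choice of
SHA-256 are assumptions for illustration, not anything Rabbit
prescribes.

```python
import hashlib


def queue_for(api_key, num_queues, prefix="tasks"):
    """Map an API key to one of num_queues queue names.

    The same key always maps to the same queue, so one client's tasks
    stay together while different clients are spread across queues.
    """
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    digest = hashlib.sha256(api_key.encode("utf-8")).digest()
    # Use the first 4 bytes of the hash as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % num_queues
    return f"{prefix}.{index}"


# Hypothetical API keys; a publisher would send each task to the
# queue name returned here.
for key in ["client-a", "client-b", "client-c"]:
    print(key, "->", queue_for(key, 4))
```

With num_queues set to 1 this degenerates to the single-queue
topology, so the publisher code would not need to change if more
queues were added later.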

The point about premature optimization is well taken; I think it would
be ideal to have a single queue. I'm just trying to figure out whether
that puts us at a massive disadvantage, since this queue could
potentially get big at times. Thanks very much!

Ilya

On Nov 4, 12:52 pm, Jerry Kuch <jer... at vmware.com> wrote:
> Hi, Ilya...
>
> It really depends on what you see as the likely "steady state" Rabbit will
> exist in, given your workload.  A good rule of thumb is that "a happy Rabbit
> is an empty Rabbit," i.e. the broker is happiest when consumers are pulling
> messages out of it at a rate reasonably well matched to the rate at which
> producers are pumping things into it (although of course the broker has
> mechanisms for coping with imbalances).
>
> With respect to using one big queue for all of your work items vs separating
> things into multiple queues, there are maybe a couple of things to think
> about:
>
> First, you can add multiple consumers to a single queue.  Should you find
> that your customers are producing jobs at the front end of your web site
> at a higher than anticipated rate, you can always add more back end workers
> to consume those jobs from the queue and parallelize their processing (assuming
> of course the back end tasks are parallelizable).  If you somehow reach the
> maximum throughput that a single queue can provide, then you could consider
> adding additional queues and spraying your tasks across them.
>
> The second issue to consider is that of availability.  Unless you're using the
> new active/active HA system that debuted in Rabbit 2.6.x, each queue's contents
> exist only on a single Rabbit node.  Thus, if that node becomes unavailable that
> queue's messages are unavailable until it comes back up.  In this case though,
> the solution is probably not to come up with a complex multi-queue solution in
> your application but instead to use Rabbit's HA features, either the old style
> active/passive HA (where shared storage stands behind each active/passive pair)
> or the new active/active HA (where you can specify that a queue be replicated
> on multiple nodes in a cluster, as well as how many replicas you want).
>
> So bottom line:  because you can put many consumers on a single queue, it
> shouldn't be a bottleneck unless your production rate really overwhelms
> what a single queue process can handle.  If you're worried about availability,
> using a built-in Rabbit HA feature is probably the way to go and will save you
> having to re-implement that in your application.  Finally, if you are going to
> use multiple queues and aren't in the situation of being forced to do so
> by straight-line throughput limitations, introduce new queues based on the logical
> structure of your app and its semantics for handling tasks, rather than as
> a possibly premature performance optimization.
>
> Does that make sense?  Please feel free to continue the discussion if there
> are still things you're wondering about...
>
> Best regards,
> Jerry
>
> ----- Original Message -----
> From: "Ilya Volodarsky" <i... at segment.io>
> To: rabbitmq-disc... at lists.rabbitmq.com
> Sent: Friday, November 4, 2011 8:52:42 AM
> Subject: [rabbitmq-discuss] Single vs. Multiple Queue Topology Performance
>
> Hi,
>
> We have a high-rate API that has three possible tasks. Web servers
> will receive the client's tasks, validate the input, and queue them in
> Rabbit. Then, a service will later dequeue these tasks and process
> them. I was wondering if putting all these messages into one queue has
> any performance or durability repercussions. An alternative is to
> separate the tasks into multiple queues based on the client API key,
> or on the API verb. Is there any preferred method, or does it not
> matter?
> Thanks,
>
> Ilya
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-disc... at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss