[rabbitmq-discuss] Load Balanced Consumers?

Tue Aug 16 15:36:18 BST 2011

> In all the material I've read on clustering and load balancing the
> setup I always see is publishers publishing through a load balancer
> while consumers consume from all the nodes in the cluster.

> Is it possible to place the consumers behind a load balancer? I'm
> just asking, we plan on trying it on Monday with our cluster at work.
> The setup we're thinking of is one balancer for publishers, two for
> consumers (with each consumer consuming through both), and dividing
> the cluster into segments that the consumer balancers forward to.

Load-balancing for publishers is for handling high throughput,
the theory being that more connections can be opened and more messages
processed.  Routing is done by channel processes, so routing capacity
will to some extent[1] scale along with the number of connections.

The overall scaling behaviour is probably[2] mostly dependent on the
routing topology; i.e., to how many queues each message is routed. If
messages are being fanned out to many queues, that will of course
reduce the benefit of handling many publishers and incoming messages
at once.
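
As a back-of-the-envelope illustration of why fan-out matters (a toy
model only; the function name and the numbers are made up for this
example and nothing here is measured):

```python
# Toy model: each published message is enqueued once per queue it
# routes to, so queue-side work grows with the fan-out factor.
def queue_deliveries(published: int, queues_per_message: int) -> int:
    """Return the number of enqueue operations the broker performs."""
    return published * queues_per_message

# Hypothetical workloads: same publish rate, different topologies.
direct = queue_deliveries(10_000, 1)    # each message goes to one queue
fanned = queue_deliveries(10_000, 20)   # each message goes to 20 queues
print(direct, fanned)
```

The publisher-side load is identical in both cases; only the routing
topology changes the amount of work done per message.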

You're right that no one has really gone into load-balancing consumers
(or at least, has not reported doing so here). Possibly this is because 
how queues and consumers are deployed is more tied in with application 
logic (e.g., routing) than publishing is -- i.e., it's different for 
everyone, and no one scheme works across the board. Publishing isn't 
stateful but consuming is.

From a throughput point of view, taking a single use case -- let's say
publishing to a single queue and consuming from it -- I guess the idea 
would be to consume on more than one connection, on the basis that the 
queue would load-balance across those connections and each connection 
would be on a different node. I'm not sure how much this would improve 
performance, since the queue (located on one node) still has to deliver 
messages across nodes. See [2] though.
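
To make that concrete, here is a small simulation (plain Python, not
RabbitMQ client code; the node and consumer names are invented, and it
assumes equally fast consumers so that the queue's round-robin is
even). It counts how many deliveries have to leave the queue's home
node:

```python
from collections import Counter
from itertools import cycle

def round_robin(messages, consumers):
    """Assign each message to the next consumer in turn, as one queue would."""
    assignment = {}
    turn = cycle(consumers)
    for msg in messages:
        assignment[msg] = next(turn)
    return assignment

# Hypothetical consumers, named by the node their connection lands on.
consumers = ["consumer@node-a", "consumer@node-b", "consumer@node-c"]
queue_home = "node-a"  # the queue itself lives on exactly one node

assignment = round_robin(range(9), consumers)
per_consumer = Counter(assignment.values())

# Deliveries that must hop from the queue's node to a different node.
cross_node = sum(1 for c in assignment.values() if not c.endswith(queue_home))
print(per_consumer, cross_node)
```

In this model the work is spread evenly, but two thirds of the
deliveries still cross nodes, which is the overhead I'm unsure about.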

It sounds from your brief description as though you are more interested in
failover -- e.g., if a node fails, another node will continue to deliver 
messages. Since queues are located at a particular node, this won't work 
in that mode, presently[3]; unless you are, somewhere, prepared to 
duplicate messages (e.g., with a fanout exchange and a queue bound to it 
from each segment) and deduplicate at the clients (and even then, it 
doesn't quite work, as the round-robin at each queue wouldn't be the 
same). But perhaps I have misunderstood what you're trying to achieve.
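
For what it's worth, the deduplication half could look something like
the sketch below. It is plain Python rather than client-library code,
and it assumes things the setup above doesn't give you for free: every
message carries a unique id, and a single client reads both segment
queues. The class and variable names are hypothetical. With several
consumers per queue the "seen" state would have to be shared between
them, which this does not attempt.

```python
from collections import OrderedDict

class Deduplicator:
    """Remember recently seen message ids; bounded so memory stays flat."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, message_id):
        """Return True only the first time a given id is offered."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest id
        return True

# Fanout: every message is copied into one queue per segment.
published = [("m1", "a"), ("m2", "b"), ("m3", "c")]
segment_queues = {"segment-1": list(published), "segment-2": list(published)}

dedup = Deduplicator()
processed = []
for queue in segment_queues.values():
    for message_id, body in queue:
        if dedup.first_time(message_id):
            processed.append(body)
print(processed)
```

The bounded history means a duplicate arriving after more than
`capacity` other messages would slip through, so the capacity has to
be sized against how far the two queues can drift apart.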

[1] To some extent because it does involve database reads and other
overheads.

[2] Nothing beats actually measuring.

[3] Replicated queues are coming soon.

> Is this an ideal solution? The only problem comes when the nodes
> that both connect to go down at the same time but I am putting money
> on that not happening. We'll see. :)
>
> Thanks, James