[rabbitmq-discuss] Load Balanced Consumers?
Michael Bridgen
mikeb at rabbitmq.com
Tue Aug 16 15:36:18 BST 2011
> In all the material I've read on clustering and load balancing the
> setup I always see is publishers publishing through a load balancer
> while consumers consume from all the nodes on the cluster.
> Is it possible to place the consumers behind a load balancer? I'm
> just asking, we plan on trying it on Monday with our cluster at work.
> The setup we're thinking of is one balancer for publishers, two for
> consumers (which consumers will consume on both) and divide the
> cluster into segments that the consumer balancers forward to.
Load-balancing for publishers is for handling high throughput,
the theory being that more connections can be opened and more messages
processed. Routing is done by channel processes, so that will to some
extent[1] scale along with the number of connections.
The overall scaling behaviour is probably[2] mostly dependent on the
routing topology; i.e., on how many queues each message is routed to. If
messages are being fanned out to many queues, that will of course
reduce the benefit of handling many publishers and incoming messages
at once.
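
To put rough numbers on the fan-out point, here is a back-of-envelope
sketch. It only counts enqueue operations; the function name and the
figures are made up for illustration, and it is no substitute for
measuring a real broker.

```python
def enqueue_operations(published: int, queues_per_message: int) -> int:
    """Number of queue insertions caused by `published` messages when
    each message is routed to `queues_per_message` queues."""
    return published * queues_per_message


# Same publish rate, increasingly wide fan-out (illustrative figures).
for fanout in (1, 10, 100):
    print(fanout, enqueue_operations(1000, fanout))
```

The publish side stays constant while the work behind it grows with
the fan-out, which is why adding publisher connections alone may not
help much in a wide topology.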
You're right that no-one has really gone into load-balancing consumers
(or at least, has not reported doing so here). Possibly this is because
how queues and consumers are deployed is more tied in with application
logic (e.g., routing) than publishing is -- i.e., it's different for
everyone, and no one scheme works across the board. Publishing isn't
stateful but consuming is.
From a throughput point of view, taking a single use case -- let's say
publishing to a single queue and consuming from it -- I guess the idea
would be to consume on more than one connection, on the basis that the
queue would load-balance across those connections and each connection
would be on a different node. I'm not sure how much this would improve
performance, since the queue (located on one node) still has to deliver
messages across nodes. See [2] though.
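
As a toy model of that single-queue case (plain Python, no broker
involved), the sketch below round-robins messages over consumers that
are attached to different nodes and counts how many deliveries have to
leave the queue's node. The node names, consumer names and the strict
round-robin are assumptions for illustration; a real queue's dispatch
also depends on things like prefetch and acknowledgements.

```python
from collections import Counter
from itertools import cycle

QUEUE_NODE = "node1"  # hypothetical: the node the queue lives on


def dispatch(messages, consumers):
    """Hand out messages round-robin, as one queue would.

    `consumers` is a list of (consumer_name, node_name) pairs.
    Returns a list of (message, consumer_name, node_name) tuples.
    """
    rr = cycle(consumers)
    return [(m, *next(rr)) for m in messages]


consumers = [("c1", "node1"), ("c2", "node2"), ("c3", "node3")]
deliveries = dispatch(range(9), consumers)

# Messages are spread evenly over the consumers...
per_consumer = Counter(name for _, name, _ in deliveries)
# ...but every delivery to a consumer on another node crosses the cluster.
cross_node = sum(1 for _, _, node in deliveries if node != QUEUE_NODE)

print(per_consumer)
print(cross_node)
```

In this model two thirds of the deliveries cross nodes, so the
consumers share the work but the queue's node still touches every
message.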
It sounds from your brief description as though you are more interested
in failover -- e.g., if a node fails, another node will continue to
deliver messages. Since each queue is located on one particular node,
that won't work at present[3], unless you are prepared to duplicate
messages somewhere (e.g., with a fanout exchange and a queue bound to it
from each segment) and deduplicate at the clients. Even then it doesn't
quite work, because the round-robin at each queue wouldn't be in step
with the others, so the copies of a message can reach different
clients. But perhaps I have misunderstood what you're trying to achieve.
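
To illustrate why client-side deduplication falls short, here is a
small simulation (again plain Python, not broker code). Two queues each
hold a copy of every message, each queue round-robins over the same two
consumers, and each consumer remembers only the message ids it has seen
itself. The `offset_b` parameter is a made-up way of modelling the
second queue being out of step with the first.

```python
from collections import defaultdict


def round_robin(message_ids, consumers, offset=0):
    """Return (consumer, message_id) pairs as one queue would hand them out."""
    return [(consumers[(i + offset) % len(consumers)], mid)
            for i, mid in enumerate(message_ids)]


def duplicated_ids(offset_b, consumers=("c1", "c2"), count=6):
    """Ids processed more than once when the two queues differ by `offset_b`."""
    ids = list(range(count))
    # A fanout exchange puts a copy of each message in both segments' queues.
    deliveries = round_robin(ids, consumers) + round_robin(ids, consumers, offset_b)

    seen = defaultdict(set)       # per-consumer ids already handled
    processed = defaultdict(int)  # id -> how many times it was processed
    for consumer, mid in deliveries:
        if mid in seen[consumer]:
            continue              # local deduplication catches this copy
        seen[consumer].add(mid)
        processed[mid] += 1
    return sorted(mid for mid, n in processed.items() if n > 1)


print(duplicated_ids(offset_b=0))  # queues in step: local dedup suffices
print(duplicated_ids(offset_b=1))  # queues out of step: copies slip through
```

When the two queues happen to be in step, each consumer sees both
copies and drops one. As soon as they drift apart, the copies land on
different consumers and neither can tell, so deduplication would need
state shared between the clients.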
[1] To some extent because it does involve database reads and other
overheads.
[2] Nothing beats actually measuring.
[3] Replicated queues are coming soon.
> Is this an ideal solution? The only problem comes when the nodes
> that both connect to go down at the same time but I am putting money
> on that not happening. We'll see. :)
>
> Thanks, James