[rabbitmq-discuss] questions about RabbitMQ linear scalability

Junius Wang wangjunbo924 at gmail.com
Fri Aug 23 03:14:54 BST 2013


Thanks, guys. That really helps us understand mirrored queues.


In our test scenarios we always use mirrored queues with ha-mode "exactly"
on 2 nodes. Some details: before every test (that is, after a RabbitMQ node
is added to or removed from the cluster) we:

1) Reset the policy to "exactly" on 2 nodes.
2) Declare 10 queues using rabbitmqctl and force them to be distributed
   evenly across all nodes. E.g. in a 2-node cluster, 5 queues reside on
   rabbit1 and the other 5 on rabbit2; the split is 3-3-4 for a 3-node
   cluster and 3-3-2-2 for a 4-node cluster.
3) Declare a "topic" exchange.
4) Bind the 10 queues to the exchange with 10 different routing keys.

Publishers publish messages with the 10 routing keys in turn, so all
queues receive the same number of messages.

Is the number of queues large enough? We can declare more queues in the
test, but I don't think there will be many queues in production.
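
For readers who want to reproduce the layout, here is a minimal Python
sketch of the queue placement and round-robin publishing described above.
It does not talk to a broker; the helper names (spread_queues,
round_robin_keys) and the queue and routing-key names are made up for
illustration and are not part of the original test harness.

```python
from collections import Counter
from itertools import cycle


def spread_queues(num_queues, nodes):
    """Assign queue names to nodes as evenly as possible (hypothetical helper)."""
    base, extra = divmod(num_queues, len(nodes))
    layout = {}
    next_index = 0
    for i, node in enumerate(nodes):
        # The first `extra` nodes each take one additional queue.
        count = base + (1 if i < extra else 0)
        layout[node] = [f"queue{next_index + k}" for k in range(count)]
        next_index += count
    return layout


def round_robin_keys(num_messages, routing_keys):
    """Return the routing key used for each message, cycling through the keys."""
    keys = cycle(routing_keys)
    return [next(keys) for _ in range(num_messages)]


if __name__ == "__main__":
    nodes = ["rabbit1", "rabbit2", "rabbit3", "rabbit4"]
    for node, queues in spread_queues(10, nodes).items():
        print(node, len(queues))

    routing_keys = [f"key{i}" for i in range(10)]
    per_key = Counter(round_robin_keys(1000, routing_keys))
    print(per_key)
```

With 10 queues this gives 5-5 on two nodes and 3-3-2-2 on four; on three
nodes one node holds 4 queues and the other two hold 3 each.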

Publishers and consumers both connect to the cluster via a load balancer
(AWS ELB), so they don't know which node they are connected to. That means
we may not be able to decrease the intra-cluster traffic, but we can try
some high-bandwidth instances. Would that help?

Another question: we will have one queue that handles a lot of messages,
perhaps 5000/sec (2K each), which is about 90% of the total messages
handled by the cluster. It's hard for us to split it into multiple queues;
that would be a big design change, which we are trying to avoid. From your
comments, even if messages are mirrored to only 2 nodes, we can hardly
get linear scaling, right? If so, is network traffic the major factor
limiting performance? In that case we can try some high-bandwidth instances
or an AWS instance group. Hopefully that will give better performance.
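
As a rough back-of-the-envelope check on the bandwidth question, the
sketch below estimates intra-cluster payload traffic for the busy queue.
It is a simplified model, not a description of how RabbitMQ actually moves
data: it assumes "2K" means 2 KiB, that the load balancer places a
publisher on a uniformly random node (so a message needs one forwarding
hop to the master unless the publisher happens to be on it), and that each
mirror receives one copy. Consumer deliveries, acknowledgements and
protocol overhead are ignored.

```python
MSG_RATE = 5000        # messages per second, from the post
MSG_SIZE = 2 * 1024    # bytes per message; reads "2K" as 2 KiB (assumption)


def payload_mbit(rate=MSG_RATE, size=MSG_SIZE):
    """Raw publish payload rate in Mbit/s."""
    return rate * size * 8 / 1e6


def intra_cluster_mbit(nodes, replicas=2, rate=MSG_RATE, size=MSG_SIZE):
    """Estimated node-to-node payload traffic in Mbit/s (simplified model)."""
    # Expected forwarding hops per message: the publisher is on a
    # non-master node with probability (nodes - 1) / nodes.
    forward_hops = (nodes - 1) / nodes
    # One copy per mirror; "exactly 2" means one master plus one mirror.
    mirror_copies = replicas - 1
    return payload_mbit(rate, size) * (forward_hops + mirror_copies)


if __name__ == "__main__":
    print(f"publish payload: {payload_mbit():.2f} Mbit/s")
    for n in (2, 3, 4):
        print(f"{n} nodes: {intra_cluster_mbit(n):.2f} Mbit/s between nodes")
```

Under these assumptions the payload alone is on the order of 80 Mbit/s, so
whether the network is the limiting factor depends on the instance type;
the numbers here are only a starting point for measuring real traffic.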


-----Original Message-----
From: rabbitmq-discuss-bounces at lists.rabbitmq.com
[mailto:rabbitmq-discuss-bounces at lists.rabbitmq.com] On Behalf Of Michael
Klishin
Sent: Thursday, August 22, 2013 4:56 PM
To: Discussions about RabbitMQ
Subject: Re: [rabbitmq-discuss] questions about RabbitMQ linear scalability

Junius Wang:

> 1. The throughput of two node cluster is 50%-60% worse than a single
> node broker.

With mirroring, messages have to be copied to multiple (N or, depending on
configuration, even all) nodes in the cluster. That obviously takes more
time than not copying anything.

> 2. Adding more node did have improvement on throughput but we only
> got 25% improvement(throughput of 3 node cluster is 25% better than 2 node
> cluster. 4 node cluster is 25% better than 3 node cluster too). What we
> expected is a 45-degree line, that means when 2 nodes are used the
> throughput is double. With 3 nodes, then triple.

You are not providing any details about your workload but the way you extend
capacity with RabbitMQ is by adding nodes and using more queues. Queue
mirroring is an HA feature, which means copying more data across the
cluster.

Maximum throughput and highest availability are largely at odds with each
other, so you need to

1. Use multiple queues (and the number should grow with the number of nodes)
2. Choose what queues to mirror. Likely not all queues are equally important
to your system, so you can make some of them non-HA.
3. Configure mirroring to, say, 2 nodes instead of "all".
4. If you know which node is the master for a particular queue (e.g. it was
declared on that node), make your clients connect there. It will decrease
intra-cluster traffic.
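
Points 2 and 3 can be combined in a single policy that mirrors only
selected queues to 2 nodes. The Python sketch below just builds the
rabbitmqctl set_policy command line for such a policy. The policy name
"ha-two" and the queue-name pattern "^ha\." are hypothetical; the
ha-mode and ha-params keys are the ones RabbitMQ uses for mirroring
policies.

```python
import json
import shlex


def ha_policy_command(name, pattern, replicas=2):
    """Build a `rabbitmqctl set_policy` command that mirrors matching queues."""
    # "exactly" keeps each matching queue on `replicas` nodes in total.
    definition = {"ha-mode": "exactly", "ha-params": replicas}
    args = ["rabbitmqctl", "set_policy", name, pattern, json.dumps(definition)]
    # Quote each argument so the line can be pasted into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Only queues whose names start with "ha." are mirrored (hypothetical
    # naming scheme); every other queue stays non-HA.
    print(ha_policy_command("ha-two", r"^ha\."))
```

Queues whose names do not match the pattern are left unmirrored, which
keeps replication traffic limited to the queues that need it.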
--
MK



