[rabbitmq-discuss] distributed cluster questions about performance
姚仁捷
baniu.yao at gmail.com
Wed Mar 13 03:49:31 GMT 2013
I'm running a cluster of 6 servers, all of them disk nodes. I use
RabbitMQ to collect log files for our website. At peak hours the
publish rate is about 30k messages per second. There are 2 main
consumers (HDFS and Elasticsearch), and each one needs to handle every
message, so the delivery rate reaches about 60k per second.
In my scenario, a single server can sustain a delivery rate of about 10k
per second, so I use 6 nodes to spread the load. My solution is to create
2 queues on each node. Each message is published with a random routing key
(something like message.0, message.1, etc.) to distribute the load across
all nodes.
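A minimal sketch of the random routing-key scheme described above. It only generates keys and does not talk to a broker. The key count of 12 (one key per queue, 6 nodes x 2 queues) and the function name random_routing_key are assumptions made for illustration; the actual number of keys and the exchange bindings are not stated here.

```python
import random
from collections import Counter

NUM_NODES = 6          # from the post: 6 cluster nodes
QUEUES_PER_NODE = 2    # from the post: 2 queues per node
# Assumption: one routing key per queue, so 12 distinct keys in total.
NUM_KEYS = NUM_NODES * QUEUES_PER_NODE


def random_routing_key(rng=random):
    """Return a key such as 'message.7', chosen uniformly at random."""
    return "message.%d" % rng.randrange(NUM_KEYS)


if __name__ == "__main__":
    # Count how many of 120,000 simulated messages land on each key.
    rng = random.Random(0)
    counts = Counter(random_routing_key(rng) for _ in range(120000))
    for key in sorted(counts, key=lambda k: int(k.split(".")[1])):
        print(key, counts[key])
```

With a uniform choice, each key should receive roughly 1/12 of the messages, so the per-queue load evens out only if every key is bound to exactly one queue.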
What confuses me is:
1. All messages are published to a single node. Should I use HAProxy to
load-balance the publishing?
2. Is there any performance difference between durable queues and
transient queues?
3. Is there any performance difference between a memory node and a disk
node? As far as I know, the two differ only in how they handle metadata
such as queue configuration.
4. How can I improve performance in my publishing and delivery code? From
my research I know of several methods:
- disable the confirm mechanism (in the publishing code?)
- enable HiPE (I've done that and it helped a lot)
5. For example, suppose the input is 10k mps (messages per second) and
there are two consumers that each consume every message. Then the output
is 20k mps. If one server can handle 10k mps, I need two servers to handle
the 20k mps load. Now suppose a new consumer also needs to consume every
message. The output then reaches 30k mps, so I need one more server. In
conclusion: does each additional consumer of all messages require one
additional server?
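The arithmetic behind question 5 can be written out as a short sketch. It assumes the simple model used in the question: load is driven only by the delivery rate, every consumer receives every message, and capacity scales linearly with the number of servers. That model is an assumption, not a measured RabbitMQ property, and the function name servers_needed is hypothetical.

```python
def servers_needed(publish_rate, consumers, per_server_capacity):
    """Servers required when each consumer receives every message.

    Assumes linear scaling and that only the delivery rate matters,
    which is the model from the question, not a guarantee.
    """
    delivery_rate = publish_rate * consumers
    # Integer ceiling division: round up to a whole number of servers.
    return (delivery_rate + per_server_capacity - 1) // per_server_capacity


if __name__ == "__main__":
    # The example from question 5: 10k mps in, 10k mps per server.
    for consumers in (2, 3):
        print(consumers, "consumers:", servers_needed(10000, consumers, 10000))
    # The peak-hour figures from the top of the post.
    print("peak:", servers_needed(30000, 2, 10000))
```

Under these assumptions the server count grows by one for each extra consumer only when the publish rate equals the per-server capacity; at the peak rate of 30k mps, each extra consumer would add three servers.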