[rabbitmq-discuss] distributed cluster questions about performance
baniu.yao at gmail.com
Wed Mar 13 03:49:31 GMT 2013
I'm using 6 servers to build a cluster, and all of them are disc nodes. I use
RabbitMQ to collect log files for our website. At peak hours the publish rate
is about 30k messages per second. There are two main consumers (HDFS and
Elasticsearch), and each one needs to handle every message, so the delivery
rate hits about 60k per second.
In my scenario a single server can sustain a 10k delivery rate, so I use 6
nodes to spread the load. My solution is to create 2 queues on each node and
publish every message with a random routing key (message.0, message.1, etc.)
to distribute the pressure across all the nodes.
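The random routing-key selection can be sketched like this (the key count and
the exchange name in the comment are assumptions; the message.N scheme is the
one described above):

```python
import random

# Assumption: 2 queues on each of 6 nodes, one binding per queue.
NUM_KEYS = 12

def random_routing_key():
    """Pick one of the message.N keys uniformly so load spreads evenly."""
    return "message.%d" % random.randrange(NUM_KEYS)

# With a client such as pika this would be used roughly as:
# channel.basic_publish(exchange="logs",
#                       routing_key=random_routing_key(),
#                       body=payload)
```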
What confuses me is:
1. All messages are published to one node. Should I use HAProxy to
load-balance this publishing pressure?
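A plain TCP-mode HAProxy in front of the AMQP port is one common way to spread
publishers across the cluster nodes; a minimal sketch (the node names and IPs
are placeholders for my six servers):

```
frontend rabbitmq_front
    bind *:5670
    mode tcp
    default_backend rabbitmq_nodes

backend rabbitmq_nodes
    mode tcp
    balance roundrobin
    server rabbit1 10.0.0.1:5672 check
    server rabbit2 10.0.0.2:5672 check
    server rabbit3 10.0.0.3:5672 check
```

Publishers would then connect to the proxy on port 5670 instead of to a single
node.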
2. Is there any performance difference between durable queues and
transient (non-durable) queues?
3. Is there any performance difference between a memory node and a disc
node? As far as I know, the only difference is where metadata such as queue
configuration is stored.
4. How can I improve the performance of my publishing and delivery code? I've
researched and found several methods:
   - disable the confirm mechanism (in the publishing code?)
   - enable HiPE (I've done that and it helped a lot)
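On the first point: publisher confirms are opt-in (the channel has to issue
confirm.select), so simply not enabling them in the publishing code avoids
that overhead. HiPE, by contrast, is enabled in rabbitmq.config; a minimal
sketch of that file (Erlang term syntax, note the trailing dot):

```
[
  {rabbit, [
    {hipe_compile, true}   %% HiPE-compile hot modules at broker startup
  ]}
].
```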
5. For example, if the input is 10k mps (messages per second) and there are
two consumers that each consume every message, the output is 20k mps. If one
server can handle 10k mps, I need two servers to handle that 20k-mps
pressure. Now a new consumer needs to consume every message too; the output
hits 30k mps, so I need one more server. In conclusion: does each additional
consumer of the full stream cost one more server?
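The arithmetic in point 5 can be written down directly: since every consumer
receives the full stream, the total delivery rate is the publish rate times
the consumer count. A small sketch (the function name and the 10k-per-server
ceiling are assumptions taken from the numbers above):

```python
import math

def servers_needed(publish_rate, num_consumers, per_server_capacity):
    """Every consumer receives every message, so delivery load
    grows linearly with the number of consumers."""
    total_delivery = publish_rate * num_consumers
    return math.ceil(total_delivery / per_server_capacity)

# With 10k mps in and a 10k mps per-server ceiling:
# 2 consumers -> 20k out -> 2 servers; 3 consumers -> 30k out -> 3 servers.
```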