I'm running a 6-server RabbitMQ cluster in which every node is a disk node. I use RabbitMQ to collect log files for our website. At peak hours, the publish rate is about 30k messages per second. There are 2 main consumers (HDFS and Elasticsearch), and each one needs to handle every message, so the delivery rate reaches about 60k messages per second.<div><br></div><div>In my scenario, a single server can sustain a delivery rate of about 10k messages per second, so I use 6 nodes to spread the load. My solution is to create 2 queues on each node. Each message is published with a random routing key (something like message.0, message.1, etc.) to distribute the load across all nodes.</div><div><br></div><div>What confuses me is:</div><div><br></div><div><ol><li><span style="line-height: normal;">All messages are sent to one node. Should I use HAProxy to load balance the publishing load?</span></li><li><span style="line-height: normal;">Is there any performance difference between durable queues and transient queues?</span></li><li><span style="line-height: normal;">Is there any performance difference between a memory node and a disk node? As far as I know, the only difference between them is how metadata such as queue configuration is stored.</span></li><li>How can I improve performance in the publishing and delivery code? I've done some research and know of several methods:</li><ul><li>disable the confirm mechanism (in the publishing code?)</li><li>enable HiPE (I've done that and it helped a lot)</li></ul><li>For example, suppose the input is 10k mps (messages per second) and two consumers each consume every message, so the output is 20k mps. If one server can handle 10k mps, I need two servers to handle the 20k mps load. Now a new consumer also needs to consume every message, so the output reaches 30k mps and I need one more server. In conclusion: does every additional consumer of all messages require one more server?</li></ol></div>
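<div><br></div><div>To make the arithmetic behind question 5 concrete, here is a minimal sketch of how I'm estimating capacity. It assumes that delivery rate is the only bottleneck and that capacity scales linearly with the number of nodes, which may well not hold in practice. The function names and the <code>num_keys</code> parameter are hypothetical and only for illustration; they are not part of any RabbitMQ API.</div>

```python
import math
import random


def delivery_rate(publish_rate, consumers):
    """Total delivery rate when every consumer receives every message."""
    return publish_rate * consumers


def servers_needed(publish_rate, consumers, per_node_capacity):
    """Nodes required, ASSUMING delivery rate is the only bottleneck
    and that capacity adds up linearly across nodes."""
    total = delivery_rate(publish_rate, consumers)
    return math.ceil(total / per_node_capacity)


def random_routing_key(num_keys, rng=random):
    """Pick a key such as 'message.3' uniformly at random.

    num_keys is a hypothetical parameter: the number of distinct
    routing keys that the queues are bound to.
    """
    return f"message.{rng.randrange(num_keys)}"


if __name__ == "__main__":
    # Peak load from the post: 30k publishes/s, 2 consumers, 10k/s per node.
    print(servers_needed(30_000, 2, 10_000))
    # Question 5: 10k publishes/s with a third consumer added.
    print(servers_needed(10_000, 3, 10_000))
    # Example routing key for a publish call.
    print(random_routing_key(6))
```

<div>Under these assumptions, the node count grows by one for every extra consumer that needs all messages, which is exactly the conclusion I'd like to confirm or correct.</div>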