[rabbitmq-discuss] RabbitMQ Clustering Million of Users
Gavin M. Roy
gmr at myyearbook.com
Thu Jul 5 06:10:42 BST 2012
On Thursday, July 5, 2012 at 12:42 AM, eleanor wrote:
> how many nodes/workers - how they are connected.
On average, I see between 500 and 1000 connections and channels per node in my infrastructure.
> And also, all the nodes should be on
> LAN, so the connections are really fast (so that packets per second don't
> cause too much delay when nodes are exchanging messages).
Packets per second will depend on how big your messages are and how many of them you are pushing. I would be concerned about network infrastructure on a LAN if I were building out a high-velocity messaging service.
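To make that relationship concrete, here is a rough back-of-the-envelope sketch in Python. The function name and the 1460-byte payload per packet (a common figure for TCP over 1500-byte Ethernet frames) are assumptions for illustration. It ignores AMQP framing overhead, TCP coalescing of small messages, and ACK traffic, so treat the result as an order-of-magnitude estimate only.

```python
import math


def packets_per_second(msg_bytes, msgs_per_sec, payload_per_packet=1460):
    """Rough estimate of data packets per second for one message stream.

    Assumes each message is sent on its own and split into packets of at
    most `payload_per_packet` bytes (hypothetical default: 1460).
    """
    packets_per_msg = max(1, math.ceil(msg_bytes / payload_per_packet))
    return packets_per_msg * msgs_per_sec


if __name__ == "__main__":
    # Same publish rate, very different packet rates depending on size.
    for size in (200, 1500, 64 * 1024):
        print(size, "bytes ->", packets_per_second(size, 10_000), "packets/s")
```

The point is only that message size and message rate multiply, so measure both for your own traffic.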
> And how much RAM
> are we talking about.
It depends on your use case. I usually have 16GB to 24GB per node so that, if consumers die off, I can keep my queues in RAM while they are brought back online. My use cases are generally quite different from what I gather you are doing, however.
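One way to reason about that sizing is to estimate how much backlog builds up while consumers are down. The sketch below is a simplification with made-up parameter names: `per_msg_overhead` and `usable_fraction` are placeholders I am assuming for illustration, not measured RabbitMQ values, so check them against your own broker's memory settings and observed usage.

```python
def backlog_bytes(publish_rate, avg_msg_bytes, outage_seconds, per_msg_overhead=0):
    """Bytes queued if nothing is consumed for `outage_seconds`.

    `per_msg_overhead` is a hypothetical allowance for per-message
    bookkeeping in the broker; measure it rather than trusting a guess.
    """
    return publish_rate * outage_seconds * (avg_msg_bytes + per_msg_overhead)


def fits_in_ram(backlog, ram_bytes, usable_fraction=0.4):
    """True if the backlog fits in the share of RAM left for queues.

    `usable_fraction` is an assumed headroom figure, not a broker default.
    """
    return backlog <= ram_bytes * usable_fraction


if __name__ == "__main__":
    # Example: 5,000 msg/s of 2 KiB messages with consumers down 10 minutes.
    backlog = backlog_bytes(5_000, 2_048, 600)
    for gib in (8, 16, 24):
        print(gib, "GiB node:", fits_in_ram(backlog, gib * 2**30))
```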
> I would really like to know some specific numbers
> about how to build such a cluster -
You would really need to benchmark your own use case: message size, publishing application, consuming application, and network infrastructure all play a part.
> which VPS provider (or any other provide
> to chose from), what VPS to choose: how much CPU, RAM, disk, etc.
I don't use virtual servers, so I can't offer any advice there.
> And also: the GAE instances are then really not that great. You're saying
> that 1 node can compare to 400 GAE instances?
Probably not. Your best bet is to benchmark your app's throughput and emulate end users talking to your application.
As you're running through this process, I would figure out your application architecture.
How is data going to be routed through your system? What is the upper-end performance for your publishers and consumers? Will you have multiple consumers per queue? How many queues will you have? Will you be using mirrored queues? Are you using persistent messages? What kind of exchange will you be using? How much of your publishing will be across the cluster vs on the node the user is connected to (i.e. sending messages to queues on different machines)? These questions (and more) really factor into performance and scaling.
Your best bet, as I mentioned before, is to simulate what you want to do. Set up a small-scale test, base some assumptions on the results, tweak settings and make changes, then re-test. Once you're happy with that, test at a larger scale. When going through this exercise, I will often test at multiples of my intended throughput or target scale.
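As a minimal sketch of that test loop in Python: the `publish` callable below is a stand-in you would replace with your real client's publish call, and the multiples (1x, 2x, 5x) are arbitrary examples, not a recommendation. It only measures how fast a single publisher loop runs, which is one small piece of a real benchmark.

```python
import time


def scaled_rates(target_rate, multiples=(1, 2, 5)):
    """Message rates to test, as multiples of the intended throughput."""
    return [int(target_rate * m) for m in multiples]


def measure_rate(publish, n_messages):
    """Call `publish(i)` n_messages times and return messages per second."""
    start = time.perf_counter()
    for i in range(n_messages):
        publish(i)
    elapsed = time.perf_counter() - start
    return n_messages / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    def fake_publish(i):
        # Placeholder: swap in a real publish to your broker here.
        pass

    achieved = measure_rate(fake_publish, 100_000)
    for rate in scaled_rates(1_000):
        status = "ok" if achieved >= rate else "too slow"
        print(f"target {rate} msg/s: publisher loop {status}")
```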
Hope this helped,