[rabbitmq-discuss] Performance Observations and Interesting Behavior
michael.laing at nytimes.com
Wed Feb 12 19:12:45 GMT 2014
All of our inter-cluster connections use shovels, both within and between regions.
A cluster picks one of its nodes to run the shovel on. That node takes the
configured list of nodes in a remote cluster and picks one to connect to.
When local or remote nodes go down, things adjust. Mostly we see this
during rolling restarts. We have found it very rugged in production.
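A minimal sketch of what such a dynamic shovel definition might look like. The node hostnames, queue names, and shovel name here are hypothetical examples, not the NYT configuration; the list of source URIs is what lets the shovel fail over to another remote node when its current connection drops.

```python
import json

# Sketch of a dynamic shovel definition with multiple source URIs.
# Hostnames and queue names are made-up examples. When the connected
# source node goes down, the shovel retries against the URI list.
shovel_value = {
    "src-uri": ["amqp://rabbit1.remote.example.com",
                "amqp://rabbit2.remote.example.com"],
    "src-queue": "inbound",
    "dest-uri": "amqp://",            # connect to the local cluster
    "dest-queue": "inbound-local",
    "reconnect-delay": 5,             # seconds between reconnect attempts
}

# This body could be PUT to the management API at
# /api/parameters/shovel/%2f/my-shovel, or set with rabbitmqctl set_parameter.
body = json.dumps({"value": shovel_value})
print(body)
```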
External clients connect via a DNS name which will round-robin to one of
the cluster nodes. We use Route 53 health checks to ensure nodes are in rotation.
Our external clients use PHP, Java, node.js, and whatever else to connect -
possibly some of them are using clients smart enough to fail over by
themselves... so we also expose the DNS name of each node in the cluster as well.
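The client side of that scheme can be sketched roughly as follows: resolve the round-robin name to its A records, shuffle, and try each node until one accepts. This is an illustrative sketch, not the code any of those PHP/Java/node.js clients actually use, and the port and hostnames are assumptions.

```python
import random
import socket

def resolve_candidates(hostname, port=5672):
    """Resolve a round-robin DNS name to its A records, shuffled so
    clients spread across cluster nodes instead of piling on one."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET,
                               socket.SOCK_STREAM)
    addrs = sorted({info[4][0] for info in infos})
    random.shuffle(addrs)
    return addrs

def connect_first_available(addrs, connect=None, port=5672):
    """Try each address in turn and return the first working connection.
    `connect` is injectable so the failover logic can be tested offline."""
    connect = connect or (lambda a: socket.create_connection((a, port),
                                                             timeout=2))
    for addr in addrs:
        try:
            return addr, connect(addr)
        except OSError:
            continue
    raise OSError("no cluster node reachable")
```

Health-checked DNS removes dead nodes from the answer set over time; the per-client retry loop covers the window before DNS catches up.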
On Wed, Feb 12, 2014 at 1:20 PM, Ron Cordell <ron.cordell at gmail.com> wrote:
> Thanks for the response - that's very interesting. We were quite
> interested in your setup when you posted to the rabbit list about the setup
> for the NYT :)
> How exactly do you distribute the connections? Does the rabbit driver do
> that for you by choosing from a list, or do you use some other method?
> On Tue, Feb 11, 2014 at 4:05 PM, Laing, Michael <michael.laing at nytimes.com> wrote:
>> That's interesting!
>> We have removed all the load balancers from our core configurations in
>> Amazon EC2 because we found they added no value and, in fact, provided
>> troublesome additional points of failure. (We do use ELBs to find websocket
>> endpoints in the client-facing retail layer.)
>> Our core clusters in Oregon and Dublin each have 50 - 100 non-local
>> connections, randomly distributed, and are very stable.
>> We use DNS with health checks for internal client connections in lieu of
>> load balancers. Simple and rugged.
>> Michael Laing
>> On Tue, Feb 11, 2014 at 6:42 PM, Ron Cordell <ron.cordell at gmail.com> wrote:
>>> Hi all --
>>> We've been performance testing RabbitMQ on Linux as we're about to move
>>> our RabbitMQ infrastructure from Windows to Linux (as well as other
>>> things). I wanted to share some of what we observed and see if people have any
>>> feedback. All tests were done using a 3-node cluster where most queues are
>>> HA, with an F5 configured to provide a virtual IP to the application. There
>>> is a single vHost.
>>> 1. On the same hardware the Linux installation easily outperforms the
>>> Windows installation. It also uses fewer resources for the same throughput.
>>> 2. The Windows cluster becomes unstable and nodes start dropping
>>> out/partitioning at around 1/3 max tested volume. The Linux cluster showed
>>> no instability whatsoever up to maximum throughput.
>>> 3. Creating a cluster with 2 RAM nodes and 1 disc node has the same disk
>>> I/O requirements as 3 disc nodes. (This makes sense, as I believe the RAM
>>> nodes still persist HA queue messages to disk.)
>>> 4. (here is the interesting one) When the F5 is configured to load
>>> balance across the 3 nodes as a round-robin load balancer, maximum
>>> throughput is significantly less than if the F5 sends all traffic to a
>>> single node.
>>> I'd love any feedback, especially on #4.
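One plausible contributor to #4 is that with mirrored queues, a publish arriving at a node that does not host the queue master must be forwarded to the master before being mirrored, so round-robin balancing adds an extra intra-cluster hop for roughly two thirds of publishes. The rough model below is an illustration of that hypothesis under simplifying assumptions (all queue masters on one node, one hop per forward and per mirror copy), not an exact account of RabbitMQ's internals.

```python
def intracluster_hops(publish_node, master_node, n_mirrors):
    """Rough cost model: forwarding a publish to the queue master costs
    one extra intra-cluster hop; mirroring costs one hop per mirror.
    Illustrative approximation only."""
    forward = 0 if publish_node == master_node else 1
    return forward + n_mirrors

# 3-node cluster, queue master on node 0, mirrored to the other 2 nodes,
# 300 publishes.
single_node = sum(intracluster_hops(0, 0, 2) for _ in range(300))
round_robin = sum(intracluster_hops(i % 3, 0, 2) for i in range(300))
print(single_node, round_robin)  # round-robin does ~33% more hops here
```

Under this model, sending everything to the master node avoids the forwarding hop entirely, which is consistent with the single-node configuration out-throughputting round-robin.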
>>> rabbitmq-discuss mailing list
>>> rabbitmq-discuss at lists.rabbitmq.com