[rabbitmq-discuss] RabbitMQ Cluster On AWS VPC

Laing, Michael P. Michael.Laing at nytimes.com
Fri Dec 14 17:27:05 GMT 2012

We have not yet seen any performance issues. All traffic between instances in different availability zones within a region is charged: $0.01/GB in us-east-1. So there is a cost, and the volume of traffic through your HA queues determines how large it is.
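As a rough illustration of how that charge scales with queue traffic, here is a back-of-the-envelope sketch. Only the $0.01/GB rate comes from this message. The workload figures, the function name, and the assumption that each published message is copied once to every mirror in another zone are hypothetical, and 1 GB is treated as 10^9 bytes.

```python
# Back-of-the-envelope estimate of cross-zone transfer charges for mirrored
# (HA) queues. Only the rate is taken from the message; the workload below is
# a hypothetical example.

RATE_USD_PER_GB = 0.01  # inter-zone charge quoted for us-east-1


def monthly_cross_zone_cost(msgs_per_sec, avg_msg_bytes, remote_mirrors,
                            rate_usd_per_gb=RATE_USD_PER_GB, days=30):
    """Estimate the monthly charge for copying messages to mirrors in other zones.

    Assumption: every published message is sent once to each remote mirror.
    Protocol overhead, acks, and consumer traffic are ignored.
    """
    seconds = days * 24 * 60 * 60
    total_bytes = msgs_per_sec * avg_msg_bytes * remote_mirrors * seconds
    total_gb = total_bytes / 1e9  # assumption: 1 GB = 10^9 bytes
    return total_gb * rate_usd_per_gb


if __name__ == "__main__":
    # Hypothetical workload: 1,000 msgs/s of 2,000 bytes, mirrored to 2 other zones.
    print(f"${monthly_cross_zone_cost(1000, 2000, 2):.2f} per month")
```

The estimate is linear in message rate, message size, and mirror count, so halving any one of them halves the charge.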

In us-east-1, going across zones adds less than 10 ms of latency in our small sample. We will add some continuous testing of this across all zones in the future.

Regions may vary in the amount of latency. And latency may vary (jitter).
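A minimal sketch of how such a latency and jitter probe could look, not the test we actually run. It times TCP connects to a broker node as a stand-in for round-trip time and reports jitter as the standard deviation of the samples; both of those choices are assumptions, as are the function names. Port 5672 is the default AMQP port.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Time a single TCP connect to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection is closed on leaving the block
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Summarize latency samples; jitter is taken as the population std dev."""
    return {
        "mean_ms": statistics.mean(samples_ms),
        "jitter_ms": statistics.pstdev(samples_ms),
        "max_ms": max(samples_ms),
    }


def probe(host, port=5672, count=20, pause_s=0.5):
    """Collect `count` connect timings against one node and summarize them."""
    samples = []
    for _ in range(count):
        samples.append(tcp_connect_ms(host, port))
        time.sleep(pause_s)
    return summarize(samples)
```

Running `probe()` from an instance in one zone against the private address of a node in another zone, and comparing the result with a same-zone target, gives the cross-zone increase.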

*Rather than LAN vs WAN, it would be useful to know the design envelope for RabbitMQ clusters in terms of latency and jitter, as these are measurable. It would also help to know whether any tuning can be done.

All that said, I may choose to use federation rather than clustering across zones, because the RabbitMQ response to a net split is apparently so catastrophic, and I am not sure I want to live with that uncertainty in production. We are architected in such a way that we can do either with minor modifications. We would run a cluster in each zone and federate the clusters. It's more expensive, though, and has more moving parts.

We use DynamoDB and Riak as reliable KV stores, as well as S3. Riak seems to have good tolerance of net splits. (*Why is it better than RabbitMQ in this respect?)

The 'usual' failure mode in an AWS region is that a zone becomes 'compromised', meaning that some or all instances become unreachable and/or unresponsive. We would normally have 3 clustered nodes in different zones in the region. In this scenario, we would expect to lose a node completely and have it drop from the 'wholesale' Elastic Load Balancers. We would also lose all the other instances in that zone depending upon that node – they would drop from the 'retail' Elastic Load Balancers.

An often under-reported syndrome of zone failure in a region is that the regional 'control plane' is usually compromised as well. This means that one cannot create new instances, load balancers, volumes, etc. across the entire region (all zones). Hence you cannot rely on being able to add new resources to the zones that have not failed. This affects your capacity planning etc.

Hence, in the 'short' term, both wholesale and client apps reconnect via the ELBs to existing resources in the 'good' zones, which have been sized to handle this scenario.
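The sizing arithmetic behind this can be sketched as follows. The message does not give actual sizing figures, so this assumes load is spread evenly across zones; the function name is hypothetical.

```python
def required_headroom(zones, zones_lost=1):
    """Extra capacity each zone needs, as a fraction of its normal load,
    so that the surviving zones can absorb the load of the lost ones.

    Assumption: load is spread evenly across zones before and after the failure.
    """
    if not 0 <= zones_lost < zones:
        raise ValueError("zones_lost must be between 0 and zones - 1")
    surviving = zones - zones_lost
    # Each surviving zone goes from 1/zones of the load to 1/surviving of it.
    return zones / surviving - 1.0


if __name__ == "__main__":
    # Three zones, one compromised: survivors go from 1/3 to 1/2 of the load.
    print(f"{required_headroom(3):.0%} headroom per zone")
```

Because the control plane may be unavailable during the failure, this headroom has to be running already rather than added on demand.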

At the same time, we start re-routing new connections to other healthy regions by automatically adjusting Route 53 weighted routing parameters, which are normally set to route only using least latency. The other regions will autoscale resources to handle the load. For example, us-east-1 is backed up by splitting its load between us-west-2 and eu-west-1. If necessary, we will also gradually shed the existing load from the compromised region by causing clients to reconnect.
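A sketch of the weight calculation for that re-routing step. The backup map holds only the us-east-1 example given above; the even split between backup regions and the function name are assumptions, and applying the weights to Route 53 record sets is left out. Route 53 weights are integers from 0 to 255, so a total of 100 fits.

```python
# Backup regions for a compromised region. Only the us-east-1 entry comes from
# the message; other regions would need their own (hypothetical) entries.
BACKUPS = {"us-east-1": ["us-west-2", "eu-west-1"]}


def failover_weights(compromised, backups=BACKUPS, total=100):
    """Return integer routing weights for new connections.

    The compromised region gets weight 0 and `total` is split across its
    backup regions. Assumption: the split is even; any remainder goes to the
    regions listed first.
    """
    targets = backups[compromised]
    share, remainder = divmod(total, len(targets))
    weights = {compromised: 0}
    for i, region in enumerate(targets):
        weights[region] = share + (1 if i < remainder else 0)
    return weights


if __name__ == "__main__":
    print(failover_weights("us-east-1"))
```

Shedding existing load gradually, as described above, would correspond to stepping the compromised region's weight down over time instead of setting it to 0 at once.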

Best regards,


From: 王俊波 <wangjunbo924 at gmail.com>
Reply-To: rabbitmq <rabbitmq-discuss at lists.rabbitmq.com>
Date: Thu, 13 Dec 2012 20:14:09 -0500
To: rabbitmq <rabbitmq-discuss at lists.rabbitmq.com>
Subject: Re: [rabbitmq-discuss] RabbitMQ Cluster On AWS VPC

Hi Laing, are there any performance issues?
And I believe that it's much more expensive to build a cluster across AWS zones. Is it?
One more question: in your experience, how should clients connect to the cluster? A load balancer in front of the cluster, DNS round robin, or the Java client library alone?

2012/12/13 Laing, Michael P. <Michael.Laing at nytimes.com>
Clustering across AWS availability zones works fine in our experience so


On 12/13/12 5:48 AM, "Simon MacMullen" <simon at rabbitmq.com> wrote:

>On 13/12/12 10:44, 王俊波 wrote:
>> We are going to build a RabbitMQ cluster with HA on AWS VPC. Suppose there
>> are two subnets in the VPC, and they are in different availability zones (
>> for more info about AWS availability zones, please visit:
>>   Then I launch instances in both subnets:
>> instance X in subnet1(in AWS zone us-east-1a)
>> instance Y in subnet2(in AWS zone us-east-1b)
>> So can I create a RabbitMQ cluster with HA consisting of both X and Y?
>This is a bad idea - RabbitMQ clusters do not tolerate network
>partitions well. Read http://www.rabbitmq.com/partitions.html for more
>> Another question is about federation plugin. If I build federation links
>> for X and Y symmetrically with max-hop=1, is there any way that whenever
>> a message is consumed from node X then the corresponding one in node Y
>> is dequeued automatically with no client actually consuming it?
>No, I'm afraid not.
>Cheers, Simon
>Simon MacMullen
>RabbitMQ, VMware
>rabbitmq-discuss mailing list
>rabbitmq-discuss at lists.rabbitmq.com
