[rabbitmq-discuss] Problem with RabbitMQ HA cluster on EC2 (Windows)

Richard Urwin richard at psonar.com
Tue Oct 16 13:46:18 BST 2012

Hi everyone,


I’m having problems clustering RabbitMQ on EC2 in Windows.


Apologies for the length of this post, but I thought it best to include all my steps:

a)      to help with diagnosis

b)      for anyone else who wants to set up a Windows RabbitMQ HA cluster on EC2 (if we manage to resolve the issue!)


Here’s what I’ve done so far:


1)      Started 2x EC2 Windows Server 2008 R2 instances in a ‘Rabbit’ security group


2)      On each EC2 instance:

-          installed Erlang R15B02 (Windows 64-bit binary file) with MSVCR100

-          installed RabbitMQ v2.8.7

-          installed the management plugin

-          reinstalled the service to enable the plugin: http://www.rabbitmq.com/plugins.html#windows-restart


-          opened up Windows Firewall port 4369 (for the Erlang port mapper daemon, epmd)

-          opened up Windows Firewall port 5672 (the AMQP port)

-          opened up Windows Firewall port range 55700-55800 (for Erlang inter-node communication)

-          opened up Windows Firewall port 55672 (for the management website, not strictly necessary)


-          copied an identical ‘.erlang.cookie’ file to:

o   C:\Windows

o   C:\Users\Administrator


-          created ‘rabbit.config’:

o   located in ‘C:\Users\Administrator\AppData\Roaming\RabbitMQ’

o   with contents: [{kernel, [{inet_dist_listen_min, 55700}, {inet_dist_listen_max, 55800}]}].


3)      In the ‘Rabbit’ security group:

-          opened up ports 4369, 5672, 55672, 55700-55800
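
The firewall and security group settings above can be sanity-checked from the other instance with a small TCP probe. This is only a sketch: the function name `check_ports` and the constant `CLUSTER_PORTS` are made up for illustration, and the port list is simply the one given in this post. Note that a node listens on just one port within the 55700-55800 range, so most ports in that range are expected to show as closed even when everything is configured correctly.

```python
import socket

# Ports listed in this post: epmd, AMQP, management UI, and the
# Erlang distribution range set in the config file.
CLUSTER_PORTS = [4369, 5672, 55672] + list(range(55700, 55801))


def check_ports(host, ports, timeout=1.0):
    """Return {port: bool}, True where a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and performs the TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts and name-resolution failures.
            results[port] = False
    return results
```

For example, `check_ports("rabbit02", [4369, 5672, 55672])` (where `rabbit02` is whatever name or address the other instance is reachable under) returns a dictionary showing which of those ports accepted a connection.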


NOTE: I initially installed everything on one instance, created an AMI from that installation and started another instance from the AMI. After that, ‘rabbitmqctl status’ always reported that the node wasn’t running, so I uninstalled RabbitMQ and reinstalled it, and this worked. My guess is that it’s related to the computer name, but I’m not sure how, since the database should have been recreated under the new instance name (if the name did indeed change) when the node started up. I can’t think of anything else that would cause this (any suggestions?)


4)      Followed the clustering instructions: http://www.rabbitmq.com/clustering.html.


The problem occurs when I attempt to cluster the nodes. Running the command:


C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-2.8.7\sbin>rabbitmqctl cluster rabbit@ec2-XX-XX-XX-XX.compute-1.amazonaws.com


I get the result:


Clustering node 'rabbit@IP-0XXXXXXX' with ['rabbit@ec2-XX-XX-XX-XX.compute-1.amazonaws.com'] ...

Error: {no_running_cluster_nodes,['rabbit@ec2-XX-XX-XX-XX.compute-1.amazonaws.com'],['rabbit@ec2-XX-XX-XX-XX.compute-1.amazonaws.com']}


I think this is because I’m attempting to use the EC2 public DNS name.
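
One thing that may be relevant here (stated as an assumption, not a confirmed diagnosis): Erlang distinguishes short node names, whose host part contains no dots, from long node names with a fully qualified host, and a node started in one mode does not connect to a node in the other. The local node in the output above has a short-style name, while the target was given with a full DNS name. The sketch below only classifies node-name strings; the helper names and the example host names are hypothetical.

```python
def split_node(node):
    """Split an Erlang-style node name 'name@host' into (name, host)."""
    name, sep, host = node.partition("@")
    if not sep or not name or not host:
        raise ValueError("expected name@host, got %r" % (node,))
    return name, host


def is_long_name(node):
    """True if the host part contains a dot, i.e. looks like a long name."""
    return "." in split_node(node)[1]


def same_naming_mode(node_a, node_b):
    """True if both node names are short-style or both are long-style."""
    return is_long_name(node_a) == is_long_name(node_b)
```

With placeholder values, `same_naming_mode("rabbit@IP-0A000001", "rabbit@ec2-1-2-3-4.compute-1.amazonaws.com")` is false, which is the kind of mismatch worth ruling out before looking at firewalls or cookies.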


I then added entries to my hosts file, mapping host names to the private IP addresses of my two instances:


10.XXX.XX.XX     rabbit01               # Public DNS: ec2-XX-XX-XX-XX.compute-1.amazonaws.com

10.XXX.XX.XX     rabbit02               # Public DNS: ec2-XX-XX-XX-XX.compute-1.amazonaws.com
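
To confirm that hosts-file entries like these are actually being picked up, the names can be resolved programmatically on each instance. This is a minimal sketch; `resolve_all` is a hypothetical helper, and `rabbit01`/`rabbit02` are just the names used in the entries above.

```python
import socket


def resolve_all(names):
    """Map each name to its IPv4 address, or None if it does not resolve."""
    resolved = {}
    for name in names:
        try:
            # Uses the system resolver, which consults the hosts file.
            resolved[name] = socket.gethostbyname(name)
        except OSError:
            # socket.gaierror (a subclass of OSError) signals a failed lookup.
            resolved[name] = None
    return resolved
```

Calling `resolve_all(["rabbit01", "rabbit02"])` on both machines should return the two 10.x addresses from the hosts file; a `None` value would point to a name-resolution problem rather than a RabbitMQ one.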


Running the command again, this time with the new host name:


C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-2.8.7\sbin>rabbitmqctl cluster rabbit@rabbit02


I get the same result:


Clustering node 'rabbit@IP-0AAXXXXX' with [rabbit@rabbit02] ...

Error: {no_running_cluster_nodes,[rabbit@rabbit02],[rabbit@rabbit02]}


I know this mapping can be used to reach the other Windows computer, because I can use it to access the management console in a web browser.




I then tried to change the computer name from IP-0AAXXXXX to rabbit01 (in case the name was the problem), but after a reboot it hadn’t changed.


Does anyone have any suggestions?



