[rabbitmq-discuss] Unable to Start Cluster Nodes on Windows
Ron.Cordell at RelayHealth.com
Wed Aug 28 21:43:33 BST 2013
To answer my own questions and to leave a trail for others :)
We use a Vormetric encryption agent on each of the RMQ nodes because the messages that the RMQ clusters handle must be encrypted at rest.
It turns out that, somehow, the Windows Firewall policies were changed in a way that prevented the Vormetric agent from starting, which in turn prevented the Erlang process from accessing its files on the file system. It appeared that the Erlang process would just hang waiting to access its files, and the RabbitMQ service would not start.
Fixing the Vormetric agent resolved the issue.
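For anyone who hits the same symptom, a quick probe like the one below can show whether a file is readable at all before blaming RabbitMQ itself. This is only a sketch: the function name `can_read` and the timeout value are my own choices, and the path you pass in (a RabbitMQ log file, for example) depends on your installation.

```python
import threading


def can_read(path, timeout_s=5.0):
    """Return True if the first byte of `path` can be read within timeout_s.

    Returns False if the read raises an OS error (e.g. access denied,
    file missing) or if it is still blocked when the timeout expires.
    """
    result = {"ok": False}

    def probe():
        try:
            with open(path, "rb") as f:
                f.read(1)
            result["ok"] = True
        except OSError:
            result["ok"] = False

    # Daemon thread: a read that never returns will not block interpreter exit.
    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout_s)
    return (not worker.is_alive()) and result["ok"]


# Hypothetical usage; substitute the log path of your own installation:
# print(can_read(r"C:\path\to\rabbitmq\log\file.log"))
```

If this returns False for files that the service account should be able to read, the problem is likely below RabbitMQ (in our case, the encryption agent) rather than in the broker.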
Sorry for the panic email… ;)
From: <Cordell>, Ronald Cordell <Ron.Cordell at RelayHealth.com>
Reply-To: Discussions RabbitMQ <rabbitmq-discuss at lists.rabbitmq.com>
Date: Wednesday, August 28, 2013 11:18 AM
To: Discussions RabbitMQ <rabbitmq-discuss at lists.rabbitmq.com>
Subject: [rabbitmq-discuss] Unable to Start Cluster Nodes on Windows
We have a 5-node cluster that went down last night as a result of a Windows patching event where the patch scripts didn't ensure the integrity of the cluster in between stopping/patching/starting nodes.
We are now unable to start any node in the cluster – the rabbitmqctl.bat start says it is unable to start the service.
Attempting to look at logs is not possible until the machine is rebooted because the Erlang process has a lock and we are unable to kill the Erlang process.
This is RabbitMQ 3.1.3 with Erlang 16B.
First question I have is what the heck do we have to do to kill the Erlang process? It doesn't respond to kill <pid> or to killing the process from the task dialog. Since we have RabbitMQ installed as a service, we have to set the service to not start automatically to prevent the Erlang process from starting.
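Switching the service to manual start can be scripted across nodes. The sketch below builds the Windows `sc config` command for that. It assumes the service is registered under the name "RabbitMQ"; the helper names are hypothetical, and the command has to be run from an elevated prompt on the node itself.

```python
import subprocess

# Start types accepted by "sc config <service> start= <type>".
VALID_START_TYPES = {"boot", "system", "auto", "demand", "disabled", "delayed-auto"}


def build_sc_start_command(service_name, start_type):
    """Return the argument list for changing a Windows service's start type.

    "demand" means manual start, so the service (and with it the Erlang
    process) no longer comes up automatically at boot.
    """
    if start_type not in VALID_START_TYPES:
        raise ValueError(f"unsupported start type: {start_type!r}")
    # sc expects "start=" and its value as separate tokens.
    return ["sc", "config", service_name, "start=", start_type]


def set_start_type(service_name="RabbitMQ", start_type="demand"):
    """Run the command; only meaningful on Windows with admin rights."""
    return subprocess.run(
        build_sc_start_command(service_name, start_type),
        capture_output=True,
        text=True,
        check=False,
    )
```

Calling `set_start_type()` on each node sets the service to manual; `set_start_type(start_type="auto")` restores automatic start once the node is healthy again.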
Unfortunately, rebooting the machine doesn't allow access to the logs. Even though there's no Erlang process running, the log files remain inaccessible so I can't find out what's going on.
At this point I'm considering uninstalling/re-installing in order to see if we can at least get the cluster up and running again, but I'm afraid we'll lose all messages.
Thanks for any ideas…