[rabbitmq-discuss] cluster crash when setting ha-mode=nodes

Tue Dec 17 10:57:15 GMT 2013

Nope, nothing in the logs. But on another cluster I just did rabbitmqctl stop_app and reset on one node, without setting ha-nodes first, and then it worked fine do decommission it. 

On 17 december 2013 at 17:55:41, Simon MacMullen (simon at rabbitmq.com) wrote:

On 13/12/2013 07:26, carlhoerberg wrote:  
> Set ha-mode=nodes on a all of vhosts, from ha-mode=all, to gracefully remove  
> a node. All queues failover successfully, but then /api/overview stopped  
> working, nor could we issue rabbitmqctl cluster_status. Tried to stop the  
> node we were rotating out, it didn't respond. Had to "kill" it after a  
> couple of minutes. Node 1 and 2 still doesn't respond, tries to issue "sudo  
> rabbitmqctl eval 'application:stop(rabbitmq_management).'" on node 1,  
> successfully, but on node 2 it just hangs..  
>  
> Kills node2, node1 rabbitmq_mgmt starts, but still can't query  
> /api/overview. Kills node1 too, on start up we get a lot of "home node  
> 'rabbit at lemur03' of durable queue 'celery' in vhost 'shycoqhj' is down or  
> inaccessible", but lemur03 shouldn't be master for any queue anymore!  
>  
> RabbitMQ 3.2.1, Erlang R16B01  

It's hard to guess what might be going on from this description. Do you  
have anything in the logs?  

Cheers, Simon  

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20131217/1f51cb08/attachment.html>