[rabbitmq-discuss] Node failure for a mirrored queue

Tim Watson watson.timothy at gmail.com
Mon Feb 11 08:49:14 GMT 2013


Hi Vladimir,

On 11 Feb 2013, at 07:52, Бородин Владимир wrote:

> Hi all.
> 
> I'm testing mirrored queues in v. 3.0.1. If I stop a node with 'rabbitmqctl stop_app', clients and the other nodes behave normally (clients reconnect to other nodes via a TCP balancer, and the other nodes continue to serve the queue). But if I cut one node off from the others with iptables, or kill it with Alt+SysRq+b, the cluster stops working for a long period of time.

What exactly stops working? Is the whole cluster inaccessible (all queues and exchanges), or just this particular mirrored queue?

> Is there any kind of a timeout, after which the node is considered to be dead by others?

Not as such, although the OS networking stack can take a while to notice that peers are gone. Erlang does have a kind of heartbeat mechanism though (the net tick, governed by the kernel application's net_ticktime setting, 60 seconds by default), which should notice in a fairly timely fashion that another node has gone away. How long does the 'long period of time' last exactly?
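
If it helps to put a number on that, here is a rough Python sketch for timing an outage from the client side. It is only an illustration under some assumptions: the names tcp_check and measure_outage are made up for this example, and a plain TCP connect only tells you the port is reachable, so for a real test you would want to pass in a check that publishes to (or consumes from) the mirrored queue using your own client library.

```python
import socket
import time


def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_outage(check, interval=0.5, max_wait=600.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll check() and return the length in seconds of the first outage.

    An outage runs from the first failed check to the next successful one.
    Returns None if no completed outage is observed within max_wait seconds.
    """
    start = clock()
    down_since = None
    while clock() - start < max_wait:
        ok = check()
        now = clock()
        if not ok and down_since is None:
            down_since = now          # first failure: the outage begins
        elif ok and down_since is not None:
            return now - down_since   # first success after a failure: outage ends
        sleep(interval)
    return None


# Example use (hypothetical address; point it at your balancer or a surviving node):
#   duration = measure_outage(lambda: tcp_check("127.0.0.1", 5672))
#   print("outage lasted:", duration)
```

Start it, then kill or isolate the node; the result is only accurate to within the polling interval plus the check's own timeout.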

> It does not depend on which node I kill, the primary or one of the slaves. There are 3 nodes in the cluster, and the queue is mirrored by a policy like this: '/	HA	^(?!amq\\.).*	{"ha-mode":"all"}	0'.
> If I should provide any extra information to help understand the problem, please tell me. Thanks.
> 
> --
> Vladimir

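As a quick sanity check on which names that policy pattern covers, here is a small Python sketch. It assumes two things: that the doubled backslash in the listing is just output escaping, so the pattern actually applied is ^(?!amq\.).*, and that Python's re module treats this particular pattern the same way as the regex engine RabbitMQ uses. The helper policy_applies and the sample names are made up for illustration.

```python
import re

# Assumption: the listing shows the pattern with its backslash escaped,
# so the pattern actually applied is ^(?!amq\.).*
POLICY_PATTERN = re.compile(r"^(?!amq\.).*")


def policy_applies(name):
    """Return True if the policy pattern matches the given queue or exchange name."""
    return POLICY_PATTERN.search(name) is not None


# Hypothetical names, chosen to exercise the negative lookahead.
for name in ["orders", "amq.gen-abc123", "amqp-events", "amq.direct"]:
    print(name, policy_applies(name))
```

The negative lookahead only excludes names that begin with the literal prefix "amq.", so everything else (including a name such as "amqp-events") is mirrored to all nodes.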

