Hi everyone,<div><br></div><div>I have a RabbitMQ cluster with two disk nodes.</div><div><br></div><div>Last night something happened to the network and both RabbitMQ servers crashed.</div><div><br></div><div>node1 reports the following:</div>
<div><div><br></div><div>> BOOT FAILED</div><div>> ===========</div><div>></div><div>> Timeout contacting cluster nodes: [rabbit@node2].</div></div><div><br></div><div>node2 reports the reciprocal:</div><div><div>
<br>> BOOT FAILED</div><div>> ===========</div><div>></div><div>> Timeout contacting cluster nodes: [rabbit@node1].</div><div><br></div><div><br></div><div>My instinct, for the sake of uptime, is to say: "Okay, forget node2, let's break the cluster and bring node1 online."</div>
<div><br></div><div>My problem is that, according to the docs, I need to issue a "rabbitmqctl force_reset", which I can't do unless the server is running.</div><div><br></div><div>I tried starting it using "rabbitmq-server -detached", but the server just exited after loading plugins.</div>
<div><br></div><div>Does anyone know the right course of action in this scenario?</div><div><br></div><div>Many Thanks!</div><div><br></div><div>-Dave</div>
</div>