[rabbitmq-discuss] Someone else with a nodedown error
tim at rabbitmq.com
Fri May 17 20:38:11 BST 2013
On 17 May 2013, at 16:05, Eric Berg wrote:
> We start all rabbit nodes via:
> sudo /etc/init.d/rabbit-server start
Ah. I've had problems on CentOS when doing that - it screwed up my permissions completely and I couldn't subsequently run the server. Re-installing and using `/sbin/service rabbitmq-server start` instead did the trick. You're not on a Red Hat variant, are you?
> We do have Chef managing this server, and it has since caused a restart on 2 of our 4 nodes; it is now temporarily disabled.
That could cause a number of problems, especially if the nodes are clustered, there is a netsplit, and Chef restarts them in the wrong order. Even then, they should just fail to start, rather than melt down your data centre. :)
> I will send you the log files for all 4 nodes dating back several days.
Cool thanks. I'll spend some time looking through those.
> One thing I did notice in the log file for 3 of the 4 nodes:
> =ERROR REPORT==== 16-May-2013::23:27:20 ===
> connection <0.25853.253>, channel 1 - soft error:
> "home node 'rabbit@rabbit-box' of durable queue 'my.queue.name' in vhost '/' is down or inaccessible",
Ah ok, that's probably nothing to worry about, though it may help with diagnosis.
> When looking at the log files you will notice many entries like:
> =INFO REPORT==== 17-May-2013::09:15:42 ===
> accepting AMQP connection <0.5117.0> (IP:55913 -> IP:5672)
> =WARNING REPORT==== 17-May-2013::09:15:42 ===
> closing AMQP connection <0.5117.0> (IP:55913 -> IP:5672):
> Those are our load balancers checking the node health, sorry for the log spam.
Ok sure - I'll filter those out. :)
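
For anyone wanting to do the same filtering on their own logs, something along these lines might work. It is only a rough sketch, not what I'm actually running: it assumes each log entry starts with a `=LEVEL REPORT==== <timestamp> ===` header line like the ones quoted above, and it treats every "accepting AMQP connection" / "closing AMQP connection" report as health-check noise. The names `split_reports`, `is_noise` and `filter_log` are made up for illustration.

```python
import re

# Header line of a report, e.g. "=INFO REPORT==== 17-May-2013::09:15:42 ==="
REPORT_HEADER = re.compile(r"^=[A-Z]+ REPORT==== .+ ===$")

# Assumption: any report whose body starts with one of these is load
# balancer noise. This also drops reports for genuine client connections.
NOISE_PREFIXES = ("accepting AMQP connection", "closing AMQP connection")


def split_reports(text):
    """Split log text into reports; each report is a list of lines."""
    reports, current = [], []
    for line in text.splitlines():
        # A new header closes the report collected so far.
        if REPORT_HEADER.match(line) and current:
            reports.append(current)
            current = []
        # Skip blank lines that come before any report content.
        if line.strip() or current:
            current.append(line)
    if current:
        reports.append(current)
    return reports


def is_noise(report):
    """True if the first non-blank body line starts with a noise prefix."""
    if not REPORT_HEADER.match(report[0]):
        return False
    body = [line.strip() for line in report[1:] if line.strip()]
    return bool(body) and body[0].startswith(NOISE_PREFIXES)


def filter_log(text):
    """Return the log text with the noisy connection reports removed."""
    kept = [r for r in split_reports(text) if not is_noise(r)]
    # Keep one blank line between the surviving reports.
    return "\n".join("\n".join(r).rstrip() + "\n" for r in kept)
```

Calling `filter_log(open(path).read())` on a node's log file (with `path` being wherever that log lives on your system) returns the text with those reports stripped. If real client connections matter for the diagnosis, the prefix check would need narrowing, for example by also matching the load balancer's source address.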