[rabbitmq-discuss] rabbit cluster keeps crashing

Matthew Sackman matthew at lshift.net
Fri Apr 2 16:24:11 BST 2010


Hi,

On Wed, Mar 24, 2010 at 06:07:09PM -0700, LoOoD wrote:
> We've been having a problem with our rabbit servers. We have a cluster setup,
> one disk node and two ram nodes. All clients only connect to
> the ram nodes. The ram nodes eventually stop accepting new connections.
> 
> The only way to fix so far is to restart the affected ram node. 
> 
> We're running 1.7.2-1 (installed via the deb repo) on ubuntu 8.10 64bit.
> 
> Here are some crash reports from rabbit-sasl.log:

Hmm. The nodedown simply suggests that Erlang has lost touch with the other
nodes. What kind of network are you using - is this a LAN or WAN or...
and do you see any sort of packet loss with other applications?

You may want to experiment with lower values of net_ticktime in your
rabbitmq.config file:

[{kernel, [{net_ticktime, 5}]}].

See http://ftp.sunet.se/pub/lang/erlang/doc/man/net_kernel.html#set_net_ticktime-1
for documentation. It could just be that getting the nodes to talk to each
other more frequently will solve this. On the other hand, that's likely to
only be a problem if there are large periods of inactivity in the cluster -
is this the case?
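
For reference, here is the same setting written out with comments, as a
rough sketch rather than a recommended configuration. The value 5 is only
the example figure from above; the timing notes come from the Erlang kernel
documentation, not from anything measured on your cluster.

%% rabbitmq.config (sketch)
[
 {kernel, [
   %% net_ticktime is in seconds; Erlang's default is 60.
   %% A tick is sent to each connected node every net_ticktime/4 seconds,
   %% so 5 means roughly one tick every 1.25 seconds. A silent node is
   %% declared down after about net_ticktime seconds, give or take 25%.
   {net_ticktime, 5}
 ]}
].

All nodes in the cluster should use the same value. To confirm what a
running node is actually using, net_kernel:get_net_ticktime() can be called
from an Erlang shell on that node.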

Matthew



