[rabbitmq-discuss] Hang on "starting database ...." remains in 2.8.2 cluster

Matt Pietrek mpietrek at skytap.com
Fri May 4 23:55:43 BST 2012


I've written to this alias before about this topic, and the problem
remains in 2.8.2. See:
    http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2012-February/018414.html

I have a three-node cluster running RabbitMQ 2.8.2/Erlang R13B03 on Ubuntu 10.04.

Once the cluster is up and running properly (as observed by the Web
UI), I then start/stop individual nodes in the cluster:
    rabbitmqctl stop
    rabbitmq-server

Inevitably one of the nodes won't come back up, waiting forever on
"starting database" (no 30-second timeout; it waits forever).

The only way to get all three nodes functioning again together is to
forcibly stop the other two nodes, then restart them all again.


The first item below is the console output as captured via nohup,
showing "starting database" as the last item.
The second item below is the last few lines of the rabbit@<node>.log
file, showing the node shutting down, then beginning to start up
again.
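
For anyone trying to reproduce this, here is a minimal sketch of how the
captured console output could be checked for a boot step that never
finished. It assumes the nohup output uses the "starting <step> ...done"
line format shown in the first item below; the function name
find_stalled_step and the SAMPLE text are illustrative only and are not
part of RabbitMQ.

```python
import re

# Matches boot lines such as "starting worker pool   ...done" or
# "starting database   ..." (no "done" yet). The line format is an
# assumption taken from the console output quoted in this message.
BOOT_LINE = re.compile(r"^starting (?P<step>.+?)\s*\.\.\.(?P<done>done)?\s*$")


def find_stalled_step(console_output):
    """Return the first boot step that lacks a trailing 'done', or None."""
    for line in console_output.splitlines():
        match = BOOT_LINE.match(line)
        if match and match.group("done") is None:
            return match.group("step")
    return None


# Hypothetical sample, copied from the shape of the captured output.
SAMPLE = """\
-- rabbit boot start
starting file handle cache server                                     ...done
starting worker pool                                                  ...done
starting database                                                     ...
"""

if __name__ == "__main__":
    print(find_stalled_step(SAMPLE))
```

Run against the nohup file some time after a restart, this reports which
step the node is stuck on. It only reads the text and does not talk to
the broker.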

Is it likely that a newer Erlang version would help out?
What else can I provide to help diagnose this?

Thanks,

Matt

--------
node           : rabbit@util
app descriptor : /usr/lib/rabbitmq/lib/rabbitmq_server-2.8.2/sbin/../ebin/rabbit.app
home dir       : /home/mpietrek
config file(s) : /home/mpietrek/work/var/run/rabbitmq.config
cookie hash    : pR5H9kY3Wra/XdLELT5hgQ==
log            : /home/mpietrek/work/logs/util.mpietrek.internal.illumita.com/rabbit@util.log
sasl log       : /home/mpietrek/work/logs/util.mpietrek.internal.illumita.com/rabbit@util-sasl.log
database dir   : /home/mpietrek/work/var/lib/rabbit@util
erlang version : 5.7.4

-- rabbit boot start
starting file handle cache server                                     ...done
starting worker pool                                                  ...done
starting database                                                     ...

--------

=INFO REPORT==== 4-May-2012::15:02:14 ===
    application: rabbitmq_management_agent
    exited: stopped
    type: permanent

=INFO REPORT==== 4-May-2012::15:02:14 ===
stopped TCP Listener on 0.0.0.0:5672

=INFO REPORT==== 4-May-2012::15:02:14 ===
    application: rabbit
    exited: stopped
    type: permanent

=INFO REPORT==== 4-May-2012::15:02:14 ===
    application: os_mon
    exited: stopped
    type: permanent

=INFO REPORT==== 4-May-2012::15:02:14 ===
    application: mnesia
    exited: stopped
    type: permanent

=INFO REPORT==== 4-May-2012::15:02:14 ===
Halting Erlang VM

=INFO REPORT==== 4-May-2012::15:02:52 ===
Limiting to approx 924 file handles (829 sockets)

