[rabbitmq-discuss] timeout_waiting_for_tables on node that has not changed node name
Elias Levy
fearsome.lucidity at gmail.com
Fri Dec 9 20:33:57 GMT 2011
Last night we had to reboot a RabbitMQ node in a 3-node cluster within EC2.
The node failed to restart with the dreaded timeout_waiting_for_tables
error.
Looking at past discussions on that topic, it is clear that the most common
reason for it is a node name change: either the node name contains
the IP address, the hostname changed, or a new node is being provisioned from
an image with an old mnesia DB created under some other node name.
None of those appears to apply in our current situation. The node name
does not include the IP address, and the node name did not change, as can be
seen in the start-up log. Just to be sure, we set the node name explicitly in the
/etc/rabbitmq/rabbitmq-env.conf file and attempted to restart, again
without success.
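For anyone wanting to rule out the node name cause on their own system, here is a minimal sketch of the check, not anything RabbitMQ ships. It assumes the layout shown in the start-up log, where the database dir is a subdirectory of /var/lib/rabbitmq/mnesia named after the node. The function name check_node_db is hypothetical.

```python
import os


def check_node_db(base_dir, node_name):
    """Report whether base_dir holds a database directory named after node_name.

    Returns (found, others): found is True when base_dir/node_name is a
    directory; others lists every other subdirectory, which may be
    leftovers created under a different node name.
    """
    if not os.path.isdir(base_dir):
        # Nothing to inspect, e.g. when run on a machine without RabbitMQ.
        return False, []
    subdirs = sorted(
        entry for entry in os.listdir(base_dir)
        if os.path.isdir(os.path.join(base_dir, entry))
    )
    found = node_name in subdirs
    others = [d for d in subdirs if d != node_name]
    return found, others


if __name__ == "__main__":
    # Base directory and node name are taken from the start-up log.
    found, others = check_node_db("/var/lib/rabbitmq/mnesia",
                                  "rabbit@queue-beta1-int")
    print("database dir for this node present:", found)
    print("other directories:", others)
```

If found is False while others is non-empty, the node is probably starting under a name that differs from the one its database was created with. In our case the directory matches the node name.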
I enabled mnesia debugging at the trace level, but it does not provide any
useful information as to what is causing the timeout. The cluster has
developed a backlog of persistent messages in two of the queues (about 70K
in total), but judging by the tables the system complains about, it
does not appear those are the tables it's trying to sync. All the other
metadata (users, exchanges, bindings, queues) is of very small size, so 30
seconds should be sufficient time.
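To make the list of tables easier to inspect, here is a small sketch that pulls the table names out of the boot error text. It assumes the error has the shape shown in the start-up log, a timeout_waiting_for_tables atom followed by a bracketed list. The function name tables_from_boot_error is hypothetical.

```python
import re


def tables_from_boot_error(log_text):
    """Extract the table names listed in a timeout_waiting_for_tables error.

    Returns an empty list when the error is not present in log_text.
    """
    # The list may span several lines, so match everything up to the
    # closing bracket rather than to the end of the line.
    match = re.search(r"timeout_waiting_for_tables,\s*\[([^\]]*)\]", log_text)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


if __name__ == "__main__":
    sample = """Reason: {error,
    {timeout_waiting_for_tables,
    [rabbit_user,rabbit_user_permission,rabbit_vhost,
    rabbit_durable_queue,rabbit_queue]}}"""
    for table in tables_from_boot_error(sample):
        print(table)
```

Running it over the full log gives the names one per line, which makes it easy to see that they are all metadata tables.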
While we could wipe the mnesia state from the node, we'd like to find out
why this happens and whether it can be repaired, for future reference.
The start-up log is attached below.
Any ideas?
-----------------
Activating RabbitMQ plugins ...
7 plugins activated:
* amqp_client-2.5.1
* mochiweb-1.3-rmq2.5.1-git9a53dbd
* rabbitmq_management-2.5.1
* rabbitmq_management_agent-2.5.1
* rabbitmq_mochiweb-2.5.1
* rabbitmq_stomp-2.5.1
* webmachine-1.7.0-rmq2.5.1-hg0c4b60a
Mnesia('rabbit@queue-beta1-int'): mnesia_monitor starting: <0.68.0>
Mnesia('rabbit@queue-beta1-int'): Version: "4.4.17"
Mnesia('rabbit@queue-beta1-int'): Env access_module: mnesia
Mnesia('rabbit@queue-beta1-int'): Env auto_repair: true
Mnesia('rabbit@queue-beta1-int'): Env backup_module: mnesia_backup
Mnesia('rabbit@queue-beta1-int'): Env debug: trace
Mnesia('rabbit@queue-beta1-int'): Env dir: "/var/lib/rabbitmq/mnesia/rabbit@queue-beta1-int"
Mnesia('rabbit@queue-beta1-int'): Env dump_log_load_regulation: false
Mnesia('rabbit@queue-beta1-int'): Env dump_log_time_threshold: 180000
Mnesia('rabbit@queue-beta1-int'): Env dump_log_update_in_place: true
Mnesia('rabbit@queue-beta1-int'): Env dump_log_write_threshold: 1000
Mnesia('rabbit@queue-beta1-int'): Env embedded_mnemosyne: false
Mnesia('rabbit@queue-beta1-int'): Env event_module: mnesia_event
Mnesia('rabbit@queue-beta1-int'): Env extra_db_nodes: []
Mnesia('rabbit@queue-beta1-int'): Env ignore_fallback_at_startup: false
Mnesia('rabbit@queue-beta1-int'): Env fallback_error_function: {mnesia,lkill}
Mnesia('rabbit@queue-beta1-int'): Env max_wait_for_decision: infinity
Mnesia('rabbit@queue-beta1-int'): Env schema_location: opt_disc
Mnesia('rabbit@queue-beta1-int'): Env core_dir: false
Mnesia('rabbit@queue-beta1-int'): Env pid_sort_order: false
Mnesia('rabbit@queue-beta1-int'): Env no_table_loaders: 2
Mnesia('rabbit@queue-beta1-int'): Env dc_dump_limit: 4
Mnesia('rabbit@queue-beta1-int'): Env send_compressed: 0
Mnesia('rabbit@queue-beta1-int'): Mnesia debug level set to trace
Mnesia('rabbit@queue-beta1-int'): mnesia_subscr starting: <0.69.0>
Mnesia('rabbit@queue-beta1-int'): mnesia_locker starting: <0.70.0>
Mnesia('rabbit@queue-beta1-int'): mnesia_recover starting: <0.71.0>
Mnesia('rabbit@queue-beta1-int'): mnesia_tm starting: <0.72.0>
Mnesia('rabbit@queue-beta1-int'): Schema initiated from: disc
Mnesia('rabbit@queue-beta1-int'): Transaction log dump initiated by scan_decisions
Mnesia('rabbit@queue-beta1-int'): Transaction log dump initiated by startup: {needs_dump,0}
Mnesia('rabbit@queue-beta1-int'): Transaction log dump initiated by startup: already_dumped
Mnesia('rabbit@queue-beta1-int'): Initial dump of log during startup: [dumped,dumped]
Mnesia('rabbit@queue-beta1-int'): mnesia_controller starting: <0.98.0>
Mnesia('rabbit@queue-beta1-int'): mnesia_downs = []
Mnesia('rabbit@queue-beta1-int'): Intend to load tables: []
+---+   +---+
|   |   |   |
|   |   |   |
|   |   |   |
|   +---+   +-------+
|                   |
| RabbitMQ  +---+   |
|           |   |   |
|   v2.5.1  +---+   |
|                   |
+-------------------+
AMQP 0-9-1 / 0-9 / 0-8
Copyright (C) 2007-2011 VMware, Inc.
Licensed under the MPL. See http://www.rabbitmq.com/
node : rabbit@queue-beta1-int
app descriptor : /usr/lib/rabbitmq/lib/rabbitmq_server-2.5.1/sbin/../ebin/rabbit.app
home dir : /var/lib/rabbitmq
config file(s) : /etc/rabbitmq/rabbitmq.config
cookie hash : a+Jg2nl357GwYTLG/0y3Lg==
log : /var/log/rabbitmq/rabbit@queue-beta1-int.log
sasl log : /var/log/rabbitmq/rabbit@queue-beta1-int-sasl.log
database dir : /var/lib/rabbitmq/mnesia/rabbit@queue-beta1-int
erlang version : 5.8.3
-- rabbit boot start
starting file handle cache server
...done
starting worker pool
...done
starting database
...BOOT ERROR: FAILED
Reason: {error,
{timeout_waiting_for_tables,
[rabbit_user,rabbit_user_permission,rabbit_vhost,
rabbit_listener,rabbit_durable_route,
rabbit_semi_durable_route,rabbit_route,rabbit_reverse_route,
rabbit_topic_trie_edge,rabbit_topic_trie_binding,
rabbit_durable_exchange,rabbit_exchange,
rabbit_exchange_serial,rabbit_durable_queue,rabbit_queue]}}
Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
{rabbit_mnesia,check_schema_integrity,0},
{rabbit_mnesia,ensure_schema_integrity,0},
{rabbit_mnesia,init_db,3},
{rabbit_mnesia,init,0},
{rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
{rabbit,run_boot_step,1},
{rabbit,'-start/2-lc$^0/1-0-',1}]
Mnesia('rabbit@queue-beta1-int'): mnesia_controller terminated: shutdown
Mnesia('rabbit@queue-beta1-int'): mnesia_tm terminated: shutdown
Mnesia('rabbit@queue-beta1-int'): mnesia_recover terminated: shutdown
Mnesia('rabbit@queue-beta1-int'): mnesia_locker terminated: shutdown
Mnesia('rabbit@queue-beta1-int'): mnesia_subscr terminated: shutdown
Mnesia('rabbit@queue-beta1-int'): mnesia_monitor terminated: shutdown
{"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}}"}