[rabbitmq-discuss] Outage, brokers refuse to start back up

Simon MacMullen simon at rabbitmq.com
Fri Dec 16 15:37:04 GMT 2011


Thanks. So timeout_waiting_for_tables can also be triggered by certain 
types of network failure or reconfiguration. We're looking at adding 
better diagnostics for these cases...
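
To make the restart rule quoted below concrete, here is a rough Python sketch of it. It is only a toy model of the behaviour as described in this thread, not RabbitMQ code: the function names and the node names are made up for illustration, and it deliberately ignores the network-failure and reconfiguration cases mentioned above.

```python
# Toy model of the cluster restart rule described in this thread.
# Assumption: names below (boot_outcome, restart_order, rabbit@node_a/b)
# are hypothetical and exist only for this illustration.

TABLE_WAIT_SECONDS = 30  # the wait quoted in the thread


def restart_order(shutdown_order):
    """Return a safe start order: the last node shut down starts first."""
    return list(reversed(shutdown_order))


def boot_outcome(node, last_node_down, seconds_until_last_node_up):
    """Say whether `node` would boot under the rule in the thread.

    seconds_until_last_node_up is how long after `node` starts the last
    node to shut down becomes reachable; None means it never comes back.
    """
    if node == last_node_down:
        # The last node to shut down is authoritative and does not wait.
        return "starts"
    if (seconds_until_last_node_up is not None
            and seconds_until_last_node_up < TABLE_WAIT_SECONDS):
        # The authoritative node reappeared within the wait window.
        return "starts"
    return "timeout_waiting_for_tables"


if __name__ == "__main__":
    shutdown_order = ["rabbit@node_a", "rabbit@node_b"]  # node_b went down last
    print(restart_order(shutdown_order))
    print(boot_outcome("rabbit@node_a", "rabbit@node_b", None))
```

With that shutdown order the sketch suggests starting rabbit@node_b first; starting rabbit@node_a on its own would run into the timeout.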

Cheers, Simon

On 09/12/11 15:54, James Carr wrote:
> Yes. I shut both down and when trying to restart either the master
> node or the other node I got the same error.
>
> On Fri, Dec 9, 2011 at 9:49 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
>> When you start a cluster up again after it's entirely shut down, any node
>> other than the last one to be shut down will wait at startup for the last
>> one to reappear (since the last node to shut down is authoritative). If the
>> last node does not reappear in 30s then the error message you quote is
>> shown. Did you get the same error on both nodes?
>>
>> Cheers, Simon
>>
>>
>> On 09/12/11 15:03, James Carr wrote:
>>>
>>> Okay, so I backed up our mnesia dir and wiped it clean on both boxes.
>>> Brokers started fine. I then copied the DCD and DCL files back over
>>> and my users, exchanges, and queues were back.
>>>
>>> Is there a recommended way to protect against such an outage?
>>>
>>> Thanks,
>>> James
>>>
>>>
>>> On Fri, Dec 9, 2011 at 8:41 AM, James Carr <james.r.carr at gmail.com> wrote:
>>>>
>>>> So our datacenter had a power failure yesterday and the brokers now
>>>> REFUSE to start back up. I can send the erl_crash.dump as needed (it
>>>> is quite large) but here are the other logs.
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>>
>>>> Rabbitmq log:
>>>>
>>>> =INFO REPORT==== 9-Dec-2011::08:35:32 ===
>>>> Limiting to approx 924 file handles (829 sockets)
>>>>
>>>> =ERROR REPORT==== 9-Dec-2011::08:36:02 ===
>>>> FAILED
>>>> Reason: {error,
>>>>             {timeout_waiting_for_tables,
>>>>                 [rabbit_user,rabbit_user_permission,rabbit_vhost,
>>>>                  rabbit_durable_route,rabbit_durable_exchange,
>>>>                  rabbit_durable_queue]}}
>>>> Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
>>>>              {rabbit_mnesia,check_schema_integrity,0},
>>>>              {rabbit_mnesia,ensure_schema_integrity,0},
>>>>              {rabbit_mnesia,init,0},
>>>>              {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
>>>>              {rabbit,run_boot_step,1},
>>>>              {rabbit,'-start/2-lc$^0/1-0-',1},
>>>>              {rabbit,start,2}]
>>>>
>>>> =INFO REPORT==== 9-Dec-2011::08:36:03 ===
>>>>     application: rabbit
>>>>     exited: {bad_return,{{rabbit,start,[normal,[]]},
>>>>                          {'EXIT',{rabbit,failure_during_boot}}}}
>>>>     type: permanent
>>>>
>>>>
>>>> startup_log:
>>>>
>>>> Activating RabbitMQ plugins ...
>>>> 11 plugins activated:
>>>> * amqp_client-2.6.1
>>>> * erlando-2.6.1
>>>> * mochiweb-1.3-rmq2.6.1-git9a53dbd
>>>> * rabbitmq_federation-2.6.1
>>>> * rabbitmq_management-2.6.1
>>>> * rabbitmq_management_agent-2.6.1
>>>> * rabbitmq_management_visualiser-2.6.1
>>>> * rabbitmq_mochiweb-2.6.1
>>>> * rabbitmq_shovel-2.6.1
>>>> * rabbitmq_shovel_management-2.6.1
>>>> * webmachine-1.7.0-rmq2.6.1-hg0c4b60a
>>>>
>>>>
>>>> +---+   +---+
>>>> |   |   |   |
>>>> |   |   |   |
>>>> |   |   |   |
>>>> |   +---+   +-------+
>>>> |                   |
>>>> | RabbitMQ  +---+   |
>>>> |           |   |   |
>>>> |   v2.6.1  +---+   |
>>>> |                   |
>>>> +-------------------+
>>>> AMQP 0-9-1 / 0-9 / 0-8
>>>> Copyright (C) 2007-2011 VMware, Inc.
>>>> Licensed under the MPL.  See http://www.rabbitmq.com/
>>>>
>>>> node           : rabbit at brokerm02p
>>>> app descriptor : /usr/lib/rabbitmq/lib/rabbitmq_server-2.6.1/sbin/../ebin/rabbit.app
>>>> home dir       : /var/lib/rabbitmq
>>>> config file(s) : /etc/rabbitmq/rabbitmq.config
>>>> cookie hash    : Mg6GXWPn9Lrj9HC/D14CWA==
>>>> log            : /var/log/rabbitmq/rabbit at brokerm02p.log
>>>> sasl log       : /var/log/rabbitmq/rabbit at brokerm02p-sasl.log
>>>> database dir   : /var/lib/rabbitmq/mnesia/rabbit at brokerm02p
>>>> erlang version : 5.8.4
>>>>
>>>> -- rabbit boot start
>>>> starting file handle cache server
>>>> ...done
>>>> starting worker pool
>>>>   ...done
>>>> starting database
>>>> ...BOOT ERROR: FAILED
>>>> Reason: {error,
>>>>             {timeout_waiting_for_tables,
>>>>                 [rabbit_user,rabbit_user_permission,rabbit_vhost,
>>>>                  rabbit_durable_route,rabbit_durable_exchange,
>>>>                  rabbit_durable_queue]}}
>>>> Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
>>>>              {rabbit_mnesia,check_schema_integrity,0},
>>>>              {rabbit_mnesia,ensure_schema_integrity,0},
>>>>              {rabbit_mnesia,init,0},
>>>>              {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
>>>>              {rabbit,run_boot_step,1},
>>>>              {rabbit,'-start/2-lc$^0/1-0-',1},
>>>>              {rabbit,start,2}]
>>>> {"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}}"}
>>>>
>>>> startup_err:
>>>>
>>>> Erlang has closed
>>>>
>>>> Crash dump was written to: erl_crash.dump
>>>> Kernel pid terminated (application_controller) ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}})
>>>
>>> _______________________________________________
>>> rabbitmq-discuss mailing list
>>> rabbitmq-discuss at lists.rabbitmq.com
>>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>
>>
>>
>> --
>> Simon MacMullen
>> RabbitMQ, VMware


-- 
Simon MacMullen
RabbitMQ, VMware

