[rabbitmq-discuss] Outage, brokers refuse to start back up

Simon MacMullen simon at rabbitmq.com
Fri Dec 9 15:49:12 GMT 2011


When you start a cluster up again after it has been entirely shut down, any
node other than the last one to be shut down will wait at startup for
the last one to reappear (since the last node to shut down is
authoritative). If the last node does not reappear within 30 seconds, the
timeout_waiting_for_tables error you quote is shown. Did you get the same
error on both nodes?
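
To make the ordering rule concrete, here is a rough sketch in Python. It is only a model of the behaviour described above, not anything the broker ships: the node names, the shutdown timestamps, and both helper functions are hypothetical, and it assumes you can find out when each node went down (for example from its log).

```python
from datetime import datetime

WAIT_SECONDS = 30  # how long a non-authoritative node waits, per the description above


def restart_order(shutdown_times):
    """Order nodes so the last one to shut down (the authoritative one) starts first."""
    return sorted(shutdown_times, key=shutdown_times.get, reverse=True)


def nodes_that_time_out(start_offsets, authoritative):
    """Simplified model: a node gives up if the authoritative node
    starts more than WAIT_SECONDS after it does."""
    auth_start = start_offsets[authoritative]
    return [
        node
        for node, offset in start_offsets.items()
        if node != authoritative and auth_start - offset > WAIT_SECONDS
    ]


# Hypothetical shutdown times for a two-node cluster.
shutdown_times = {
    "rabbit@node-a": datetime(2011, 12, 8, 17, 2, 10),
    "rabbit@node-b": datetime(2011, 12, 8, 17, 2, 45),
}

order = restart_order(shutdown_times)
print("start in this order:", order)

# node-a is started at t=0s, the authoritative node-b only at t=45s.
late = nodes_that_time_out({"rabbit@node-a": 0, "rabbit@node-b": 45}, order[0])
print("nodes that would hit the timeout:", late)
```

After an abrupt power failure the true shutdown order may not be knowable, in which case a sketch like this cannot tell you which node is authoritative.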

Cheers, Simon

On 09/12/11 15:03, James Carr wrote:
> Okay, so I backed up our mnesia dir and wiped it clean on both boxes.
> Brokers started fine. I then copied the DCD and DCL files back over
> and my users, exchanges, and queues were back.
>
> Is there a recommended way to protect against such an outage?
>
> Thanks,
> James
>
>
> On Fri, Dec 9, 2011 at 8:41 AM, James Carr <james.r.carr at gmail.com> wrote:
>> So our datacenter had a power failure yesterday and the brokers now
>> REFUSE to start back up. I can send the erl_crash.dump as needed (it
>> is quite large) but here are the other logs.
>>
>> Any ideas?
>>
>> Thanks,
>> James
>>
>>
>> Rabbitmq log:
>>
>> =INFO REPORT==== 9-Dec-2011::08:35:32 ===
>> Limiting to approx 924 file handles (829 sockets)
>>
>> =ERROR REPORT==== 9-Dec-2011::08:36:02 ===
>> FAILED
>> Reason: {error,
>>             {timeout_waiting_for_tables,
>>                 [rabbit_user,rabbit_user_permission,rabbit_vhost,
>>                  rabbit_durable_route,rabbit_durable_exchange,
>>                  rabbit_durable_queue]}}
>> Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
>>              {rabbit_mnesia,check_schema_integrity,0},
>>              {rabbit_mnesia,ensure_schema_integrity,0},
>>              {rabbit_mnesia,init,0},
>>              {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
>>              {rabbit,run_boot_step,1},
>>              {rabbit,'-start/2-lc$^0/1-0-',1},
>>              {rabbit,start,2}]
>>
>> =INFO REPORT==== 9-Dec-2011::08:36:03 ===
>>     application: rabbit
>>     exited: {bad_return,{{rabbit,start,[normal,[]]},
>>                          {'EXIT',{rabbit,failure_during_boot}}}}
>>     type: permanent
>>
>>
>> startup_log:
>>
>> Activating RabbitMQ plugins ...
>> 11 plugins activated:
>> * amqp_client-2.6.1
>> * erlando-2.6.1
>> * mochiweb-1.3-rmq2.6.1-git9a53dbd
>> * rabbitmq_federation-2.6.1
>> * rabbitmq_management-2.6.1
>> * rabbitmq_management_agent-2.6.1
>> * rabbitmq_management_visualiser-2.6.1
>> * rabbitmq_mochiweb-2.6.1
>> * rabbitmq_shovel-2.6.1
>> * rabbitmq_shovel_management-2.6.1
>> * webmachine-1.7.0-rmq2.6.1-hg0c4b60a
>>
>>
>> +---+   +---+
>> |   |   |   |
>> |   |   |   |
>> |   |   |   |
>> |   +---+   +-------+
>> |                   |
>> | RabbitMQ  +---+   |
>> |           |   |   |
>> |   v2.6.1  +---+   |
>> |                   |
>> +-------------------+
>> AMQP 0-9-1 / 0-9 / 0-8
>> Copyright (C) 2007-2011 VMware, Inc.
>> Licensed under the MPL.  See http://www.rabbitmq.com/
>>
>> node           : rabbit at brokerm02p
>> app descriptor :
>> /usr/lib/rabbitmq/lib/rabbitmq_server-2.6.1/sbin/../ebin/rabbit.app
>> home dir       : /var/lib/rabbitmq
>> config file(s) : /etc/rabbitmq/rabbitmq.config
>> cookie hash    : Mg6GXWPn9Lrj9HC/D14CWA==
>> log            : /var/log/rabbitmq/rabbit at brokerm02p.log
>> sasl log       : /var/log/rabbitmq/rabbit at brokerm02p-sasl.log
>> database dir   : /var/lib/rabbitmq/mnesia/rabbit at brokerm02p
>> erlang version : 5.8.4
>>
>> -- rabbit boot start
>> starting file handle cache server                                     ...done
>> starting worker pool                                                  ...done
>> starting database
>> ...BOOT ERROR: FAILED
>> Reason: {error,
>>             {timeout_waiting_for_tables,
>>                 [rabbit_user,rabbit_user_permission,rabbit_vhost,
>>                  rabbit_durable_route,rabbit_durable_exchange,
>>                  rabbit_durable_queue]}}
>> Stacktrace: [{rabbit_mnesia,wait_for_tables,1},
>>              {rabbit_mnesia,check_schema_integrity,0},
>>              {rabbit_mnesia,ensure_schema_integrity,0},
>>              {rabbit_mnesia,init,0},
>>              {rabbit,'-run_boot_step/1-lc$^1/1-1-',1},
>>              {rabbit,run_boot_step,1},
>>              {rabbit,'-start/2-lc$^0/1-0-',1},
>>              {rabbit,start,2}]
>> {"Kernel pid terminated",application_controller,"{application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}}"}
>>
>> startup_err:
>>
>> Erlang has closed
>>
>> Crash dump was written to: erl_crash.dump
>> Kernel pid terminated (application_controller)
>> ({application_start_failure,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{rabbit,failure_during_boot}}}}})


-- 
Simon MacMullen
RabbitMQ, VMware
