[rabbitmq-discuss] Recovering from power failure

tsuraan tsuraan at gmail.com
Mon Aug 3 18:08:41 BST 2009


I have a machine where rabbit (1.6.0) died brutally due to a power
failure, and now it won't start anymore.  It gets as far as "starting
persister", and then we get:

starting persister            ...Erlang has closed
{"init terminating in do_boot",{{nocatch,{error,{cannot_start_application,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{badmatch,{error,{{{badmatch,eof},[{rabbit_persister,internal_load_snapshot,2},{rabbit_persister,init,1},{gen_server,init_it,6},{proc_lib,init_p,5}]},{child,undefined,rabbit_persister,{rabbit_persister,start_link,[]},transient,100,worker,[rabbit_persister]}}}},[{rabbit,start_child,1},{rabbit,'-start/2-fun-4-',0},{rabbit,'-start/2-fun-0-',1},{lists,foreach,2},{rabbit,start,2},{application_master,start_it_old,4}]}}}}}}},[{init,start_it,1},{init,start_em,1}]}}

From the {badmatch,eof} in rabbit_persister:internal_load_snapshot/2,
I'm guessing that some file wasn't completely written, so the
persister is pretty angry.  So, I have a few questions :)
Should rabbit be robust to power failures?  I.e., is the persister's
on-disk format durable, or is it fragile to being killed mid-write?
Also, is there a better way to recover than by wiping out the files
in the rabbit queue dir and re-initializing the queues?
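
To make concrete what I mean by "durable": below is a minimal sketch of
the usual write-to-temp, fsync, rename pattern.  It is in Python rather
than Erlang, it is not RabbitMQ's actual persister code, and the
function name write_snapshot_atomically is made up for illustration.
It assumes a POSIX filesystem where rename within a directory is atomic.

```python
import os
import tempfile


def write_snapshot_atomically(path, data):
    """Replace the file at `path` with `data` (bytes) so that a crash
    leaves either the complete old file or the complete new one.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".snapshot-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the data to disk before the rename makes it visible.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

    # Sync the directory entry so the rename itself survives a power cut
    # (assumption: POSIX; opening a directory like this fails on Windows).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With something like this, a reader should never see a half-written
snapshot: the truncated data, if any, would only ever be in the
temporary file, which can be ignored or deleted on startup.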

I also have the erl_crash.dump, in case this is a bug that the dump
would help diagnose.
