[rabbitmq-discuss] Recovering from power failure
tsuraan
tsuraan at gmail.com
Mon Aug 3 18:08:41 BST 2009
I have a machine where rabbit (1.6.0) died brutally due to a power
failure, and now it can't start anymore. It gets to starting
persister, and then we get:
starting persister ...Erlang has closed
{"init terminating in do_boot",{{nocatch,{error,{cannot_start_application,rabbit,{bad_return,{{rabbit,start,[normal,[]]},{'EXIT',{{badmatch,{error,{{{badmatch,eof},[{rabbit_persister,internal_load_snapshot,2},{rabbit_persister,init,1},{gen_server,init_it,6},{proc_lib,init_p,5}]},{child,undefined,rabbit_persister,{rabbit_persister,start_link,[]},transient,100,worker,[rabbit_persister]}}}},[{rabbit,start_child,1},{rabbit,'-start/2-fun-4-',0},{rabbit,'-start/2-fun-0-',1},{lists,foreach,2},{rabbit,start,2},{application_master,start_it_old,4}]}}}}}}},[{init,start_it,1},{init,start_em,1}]}}
From the eof, I'm guessing that some file wasn't completely written,
so the persister is pretty angry. So, I have a few questions :)
Should rabbit be robust to power failures? i.e., is the persister a
durable file structure, or is it fragile to being killed when writing?
Also, is there a better way to recover than by wiping out the files
in the rabbit queue dir and re-initializing the queues?
I also have the erl_crash.dump if this is a bug that could be fixed
with a crash.dump.
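
To make the durability question concrete, here is a minimal sketch (in
Python, not Erlang) of the write-to-temp-then-rename pattern I have in
mind when I say "durable file structure". I don't know how
rabbit_persister actually writes its snapshot, so this is only an
illustration: the function name and file names are hypothetical, none of
it is RabbitMQ code, and it assumes a POSIX filesystem.

```python
import os
import tempfile


def write_snapshot_atomically(path: str, payload: bytes) -> None:
    """Replace `path` with `payload` so that a crash leaves either the
    complete old file or the complete new file, never a partial one.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".snapshot-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            # Force the data to disk before the rename makes it visible.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself survives a power failure.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With a scheme like this, a reader that finds the file at all should never
hit an unexpected eof in the middle of it, which is why I suspect the
snapshot here was being written in place when the power went out.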