[rabbitmq-discuss] RabbitMQ crash report
carlhoerberg
carl.hoerberg at gmail.com
Wed Sep 25 09:21:12 BST 2013
With a two-node RabbitMQ 3.1.5 cluster on Erlang R16B01:
Suddenly all vhost delete operations time out (or never finish) via both
the HTTP API and rabbitmqctl. Later, queue delete operations stop working,
and eventually so does creating new vhosts/users.
I restarted just one node (and yeah, as usual the node couldn't be stopped
normally but had to be killed), and when it was brought back up again it
endlessly logged:
Discarding message {'$gen_call',{<0.11812.0>,#Ref<0.0.0.180597>},stat} from
<0.11812.0> to <0.684.0> in an old incarnation (2) of this node (3)
A full cluster restart was required. But then "msg_store_persistent:
rebuilding indices from scratch" takes about ten minutes.
I have a big log dump with a lot of juicy error messages in it if you want
to take a look. Some examples:
=CRASH REPORT==== 25-Sep-2013::07:40:26 ===
crasher:
initial call: gen:init_it/6
pid: <0.1142.0>
registered_name: []
exception exit: {bad_return_value,
{error,
{{badmatch,[]},
[{rabbit_mirror_queue_master,
'-init_with_existing_bq/3-fun-0-',3,[]},
{mnesia_tm,apply_fun,3,
[{file,"mnesia_tm.erl"},{line,830}]},
{mnesia_tm,execute_transaction,5,
[{file,"mnesia_tm.erl"},{line,810}]},
{rabbit_misc,
'-execute_mnesia_transaction/1-fun-0-',1,[]},
{worker_pool_worker,handle_call,3,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,239}]}]}}}
in function gen_server2:terminate/3
ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.125.0>]
messages: []
links: [<0.247.0>,<0.1745.0>,#Port<0.22581>]
dictionary: [{{#Ref<0.0.0.32922>,fhc_handle},
{handle,{file_descriptor,prim_file,{#Port<0.22581>,853}},
0,false,0,infinity,[],true,
"/var/lib/rabbitmq/mnesia/rabbit@turtle-01/queues/5U4MQIEI1ZAQX0119BWIV8SP0/journal.jif",
[write,binary,raw,read],
[{write_buffer,infinity}],
true,true,
{1380,94826,345789}}},
{{"/var/lib/rabbitmq/mnesia/rabbit@turtle-01/queues/5U4MQIEI1ZAQX0119BWIV8SP0/journal.jif",
fhc_file},
{file,1,true}},
{fhc_age_tree,{1,
{{1380,94826,345789},
#Ref<0.0.0.32922>,nil,nil}}},
{guid,{{4087053537,1505430215,3464040155,2830024215},1}}]
trap_exit: true
status: running
heap_size: 2586
stack_size: 27
reductions: 3923
neighbours:
neighbour: [{pid,<0.1746.0>},
{registered_name,[]},
{initial_call,{gen,init_it,
['Argument__1','Argument__2',
'Argument__3','Argument__4',
'Argument__5','Argument__6']}},
{current_function,{gen_server2,process_next_msg,1}},
{ancestors,[<0.1745.0>,<0.1142.0>,rabbit_amqqueue_sup,
rabbit_sup,<0.125.0>]},
{messages,[]},
{links,[<0.1745.0>]},
{dictionary,[{random_seed,{1381,3909,13080}}]},
{trap_exit,false},
{status,waiting},
{heap_size,610},
{stack_size,7},
{reductions,213}]
neighbour: [{pid,<0.1745.0>},
{registered_name,[]},
{initial_call,{gen,init_it,
['Argument__1','Argument__2',
'Argument__3','Argument__4',
'Argument__5','Argument__6']}},
{current_function,{gen_server2,process_next_msg,1}},
{ancestors,[<0.1142.0>,rabbit_amqqueue_sup,rabbit_sup,
<0.125.0>]},
{messages,[]},
{links,[<0.1142.0>,<0.1746.0>]},
{dictionary,[]},
{trap_exit,false},
{status,waiting},
{heap_size,233},
{stack_size,7},
{reductions,104}]
=SUPERVISOR REPORT==== 25-Sep-2013::07:40:26 ===
Supervisor: {local,rabbit_amqqueue_sup}
Context: child_terminated
Reason: {bad_return_value,
{error,
{{badmatch,[]},
[{rabbit_mirror_queue_master,
'-init_with_existing_bq/3-fun-0-',3,[]},
{mnesia_tm,apply_fun,3,
[{file,"mnesia_tm.erl"},{line,830}]},
{mnesia_tm,execute_transaction,5,
[{file,"mnesia_tm.erl"},{line,810}]},
{rabbit_misc,
'-execute_mnesia_transaction/1-fun-0-',1,[]},
{worker_pool_worker,handle_call,3,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,239}]}]}}}
Offender: [{pid,<0.1142.0>},
{name,rabbit_amqqueue},
{mfa,
{rabbit_amqqueue_process,start_link,
[{amqqueue,
{resource,<<"bizmvclr">>,queue,
<<"tmp_topic-0.2714721148367971">>},
true,true,<0.9362.162>,[],<0.2758.162>,[],[],
[{vhost,<<"bizmvclr">>},
{name,<<"HA">>},
{pattern,<<"^(?!amq\\.).*">>},
{definition,[{<<"ha-mode">>,<<"all">>}]},
{priority,0}],
[]}]}},
{restart_type,temporary},
{shutdown,4294967295},
{child_type,worker}]
And:
=SUPERVISOR REPORT==== 25-Sep-2013::07:08:10 ===
Supervisor: {<0.4774.372>,rabbit_connection_sup}
Context: shutdown_error
Reason: channel_termination_timeout
Offender: [{pid,<0.22496.371>},
{name,reader},
{mfa,{rabbit_reader,start_link,
[<0.2222.372>,<0.28585.371>,
#Fun<rabbit_heartbeat.2.69784259>]}},
{restart_type,intrinsic},
{shutdown,4294967295},
{child_type,worker}]