[rabbitmq-discuss] 2.4.1 broker failure/crash
Mark Geib
mark.geib.44 at gmail.com
Wed Apr 18 21:12:22 BST 2012
We are running RabbitMQ 2.4.1 in production and recently had a failure whose
root cause we cannot determine. We also tried to restart the broker, but the
restart hung and never returned, so we rebooted the machine to restore
the broker.
We have only the rabbitmq and sasl logs at this point, but the error
messages don't mean much to us.
rabbitmq log snippet:
=INFO REPORT==== 11-Apr-2012::05:04:08 ===
starting TCP connection <0.28490.65> from 172.17.208.67:1522
=INFO REPORT==== 11-Apr-2012::05:04:08 ===
closing TCP connection <0.9195.65> from 10.70.20.75:62045
=INFO REPORT==== 11-Apr-2012::05:04:31 ===
closing TCP connection <0.10243.65> from 10.70.40.77:53173
=ERROR REPORT==== 11-Apr-2012::05:04:31 ===
** Generic server msg_store_transient terminating
** Last message in was {'$gen_cast',
{client_dying,
<<74,18,61,37,8,55,8,91,210,27,70,185,112,89,
171,154>>}}
** When Server state == {msstate,
"/var/lib/rabbitmq/mnesia/rabbit@che-csebrokerp1/msg_store_transient",
rabbit_msg_store_ets_index,
{state,417861,
"/var/lib/rabbitmq/mnesia/rabbit@che-csebrokerp1/msg_store_transient"},
0,#Ref<0.0.0.875>,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
[],undefined,0,12073198,[],<0.233.0>,421958,413764,
426055,
{set,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
...skipping...
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}}}
** Reason for termination ==
** {{badmatch,false},
[{rabbit_msg_store_ets_index,insert,2},
{rabbit_msg_store,write_message,3},
{rabbit_msg_store,handle_cast,2},
{gen_server2,handle_msg,2},
{proc_lib,wake_up,3}]}
...skipping...
=INFO REPORT==== 11-Apr-2012::05:04:43 ===
closing TCP connection <0.5032.4496> from 172.16.216.217:60234
=INFO REPORT==== 11-Apr-2012::05:04:43 ===
closing TCP connection <0.8419.6115> from 10.65.10.72:54580
=ERROR REPORT==== 11-Apr-2012::05:04:43 ===
** Generic server <0.31907.9> terminating
** Last message in was {'EXIT',<0.241.0>,shutdown}
** When Server state == {q,
{amqqueue,
{resource,<<"/alarming">>,queue,<<"alarming.9">>},
false,false,none,[],<0.31907.9>},
none,true,rabbit_variable_queue,
{vqstate,
{[],[]},
{0,{[],[]}},
{delta,undefined,0,undefined},
...skipping...
{state,fine,undefined},
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]}}},
undefined,undefined}
** Reason for termination ==
** {noproc,
{gen_server2,call,
[msg_store_transient,
{client_terminate,
<<17,102,9,148,6,184,165,141,162,246,194,57,36,62,208,135>>},
infinity]}}
** In 'terminate' callback with reason ==
** shutdown
=ERROR REPORT==== 11-Apr-2012::05:04:43 ===
** gen_event handler rabbit_error_logger crashed.
** Was installed in error_logger
** Last event was: {error,<0.146.0>,
{<0.9700.6>,
"** Generic server ~p terminating~n** Last message in was ~p~n** When Server state == ~p~n** Reason for termination == ~n** ~p~n** In 'terminate' callback with reason ==~n** ~p~n",
[<0.9700.6>,
{'EXIT',<0.241.0>,shutdown},
{q,
{amqqueue,
{resource,<<"/rssm">>,queue,
<<"cse.rssm.logManager.sqlserver">>},
false,false,none,[],<0.9700.6>},
none,true,rabbit_variable_queue,
{vqstate,
{[],[]},
{0,{[],[]}},
{delta,undefined,0,undefined},
{0,{[],[]}},
...skipping...
{noproc,
{gen_server2,call,
[msg_store_transient,
{client_terminate,
<<143,174,238,76,144,209,125,211,110,123,56,1,237,
217,136,2>>},
infinity]}},
shutdown]}}
** When handler state == {resource,<<"/">>,exchange,<<"amq.rabbitmq.log">>}
** Reason == {badarg,[{ets,lookup,[rabbit_registry,{exchange,topic}]},
{rabbit_registry,lookup_module,2},
{rabbit_exchange,type_to_module,1},
{rabbit_exchange,route,2},
{rabbit_exchange,publish,2},
{rabbit_basic,publish,1},
{rabbit_error_logger,publish1,4},
{rabbit_error_logger,handle_event,2}]}
=INFO REPORT==== 11-Apr-2012::05:04:43 ===
application: rabbit
exited: shutdown
type: permanent
sasl log snippet:
=SUPERVISOR REPORT==== 11-Apr-2012::00:15:30 ===
Supervisor: {<0.5419.34>,rabbit_channel_sup_sup}
Context: shutdown_error
Reason: shutdown
Offender: [{pid,<0.5731.34>},
{name,channel_sup},
{mfa,{rabbit_channel_sup,start_link,[]}},
{restart_type,temporary},
{shutdown,infinity},
{child_type,supervisor}]
=CRASH REPORT==== 11-Apr-2012::05:04:32 ===
crasher:
initial call: gen:init_it/7
pid: <0.232.0>
registered_name: msg_store_transient
exception exit: {{badmatch,false},
[{rabbit_msg_store_ets_index,insert,2},
{rabbit_msg_store,write_message,3},
{rabbit_msg_store,handle_cast,2},
{gen_server2,handle_msg,2},
{proc_lib,wake_up,3}]}
in function gen_server2:terminate/3
ancestors: [rabbit_sup,<0.147.0>]
messages: [{'EXIT',<0.233.0>,normal}]
links: [<0.148.0>]
dictionary: [{fhc_age_tree,{0,nil}}]
trap_exit: true
status: running
heap_size: 10946
stack_size: 24
reductions: 98380626
neighbours:
=SUPERVISOR REPORT==== 11-Apr-2012::05:04:32 ===
Supervisor: {local,rabbit_sup}
Context: child_terminated
Reason: {{badmatch,false},
[{rabbit_msg_store_ets_index,insert,2},
{rabbit_msg_store,write_message,3},
{rabbit_msg_store,handle_cast,2},
{gen_server2,handle_msg,2},
{proc_lib,wake_up,3}]}
Offender: [{pid,<0.232.0>},
{name,msg_store_transient},
{mfargs,
{rabbit_msg_store,start_link,
[msg_store_transient,
"/var/lib/rabbitmq/mnesia/rabbit@che-csebrokerp1",
undefined,
{#Fun<rabbit_variable_queue.0.66952436>,ok}]}},
{restart_type,transient},
{shutdown,4294967295},
{child_type,worker}]
=SUPERVISOR REPORT==== 11-Apr-2012::05:04:32 ===
Supervisor: {local,rabbit_sup}
Context: shutdown
Reason: reached_max_restart_intensity
Offender: [{pid,<0.232.0>},
{name,msg_store_transient},
{mfargs,
{rabbit_msg_store,start_link,
[msg_store_transient,
"/var/lib/rabbitmq/mnesia/rabbit@che-csebrokerp1",
undefined,
{#Fun<rabbit_variable_queue.0.66952436>,ok}]}},
{restart_type,transient},
{shutdown,4294967295},
{child_type,worker}]
...skipping...
=CRASH REPORT==== 11-Apr-2012::05:04:43 ===
crasher:
initial call: gen:init_it/6
pid: <0.31907.9>
registered_name: []
exception exit: {noproc,
{gen_server2,call,
[msg_store_transient,
{client_terminate,
<<213,104,174,241,176,121,164,159,98,43,221,
160,120,109,6,107>>},
infinity]}}
in function gen_server2:terminate/3
ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.147.0>]
messages: []
links: []
dictionary: [{guid,{{9,<0.31907.9>},0}}]
trap_exit: true
status: running
heap_size: 987
stack_size: 24
reductions: 443158598
neighbours:
=SUPERVISOR REPORT==== 11-Apr-2012::05:04:43 ===
Supervisor: {local,rabbit_amqqueue_sup}
Context: shutdown_error
Reason: {noproc,
{gen_server2,call,
[msg_store_transient,
{client_terminate,
<<213,104,174,241,176,121,164,159,98,43,221,160,
120,109,6,107>>},
infinity]}}
Offender: [{pid,<0.31907.9>},
{name,rabbit_amqqueue},
{mfa,{rabbit_amqqueue_process,start_link,[]}},
{restart_type,temporary},
{shutdown,4294967295},
{child_type,worker}]
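In case it helps anyone reading the two snippets together, here is a minimal sketch in Python for lining up the report headers from both logs by timestamp, so the first failure is easier to spot. It assumes every report starts with a header of the form `=ERROR REPORT==== 11-Apr-2012::05:04:31 ===`, as in the snippets above. The names `parse_reports` and `timeline` are made up for illustration and are not part of RabbitMQ.

```python
import re
from datetime import datetime

# English month abbreviations as they appear in the report headers.
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# Assumed header form: =KIND REPORT==== DD-Mon-YYYY::HH:MM:SS ===
HEADER = re.compile(
    r"^=(?P<kind>[A-Z]+) REPORT==== "
    r"(?P<day>\d{1,2})-(?P<mon>[A-Z][a-z]{2})-(?P<year>\d{4})::"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2}) ===$"
)


def parse_reports(text, source):
    """Return (timestamp, source, kind, first body line) for each report header."""
    reports = []
    lines = text.splitlines()
    for index, line in enumerate(lines):
        match = HEADER.match(line.strip())
        if match is None or match.group("mon") not in MONTHS:
            continue
        stamp = datetime(
            int(match.group("year")), MONTHS[match.group("mon")],
            int(match.group("day")), int(match.group("h")),
            int(match.group("m")), int(match.group("s")))
        # Keep the line after the header as a short summary of the report.
        summary = lines[index + 1].strip() if index + 1 < len(lines) else ""
        reports.append((stamp, source, match.group("kind"), summary))
    return reports


def timeline(rabbit_text, sasl_text, skip=("INFO",)):
    """Merge reports from both logs, drop the skipped kinds, sort oldest first."""
    merged = parse_reports(rabbit_text, "rabbit") + parse_reports(sasl_text, "sasl")
    kept = [report for report in merged if report[2] not in skip]
    # sorted() is stable, so reports with equal timestamps keep their input order.
    return sorted(kept, key=lambda report: report[0])
```

Passing the contents of the rabbitmq log and the sasl log to `timeline` would return the non-INFO reports oldest first, each with the first line of its body.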
Any help determining the cause would be appreciated.
Mark.