[rabbitmq-discuss] RabbitMQ crash report

carlhoerberg carl.hoerberg at gmail.com
Wed Sep 25 09:21:12 BST 2013


With a two-node RabbitMQ 3.1.5 cluster on Erlang R16B01:

Suddenly all vhost delete operations time out (or never finish) via both
the HTTP API and rabbitmqctl. Later, queue delete operations stop working
too, and eventually so does creating new vhosts/users.

I restarted just one node (and yeah, as usual the node can't be stopped
normally but has to be killed), and when it was brought back up again it
endlessly logged:

Discarding message {'$gen_call',{<0.11812.0>,#Ref<0.0.0.180597>},stat} from
<0.11812.0> to <0.684.0> in an old incarnation (2) of this node (3)

A full cluster restart was required, and after that "msg_store_persistent:
rebuilding indices from scratch" took about ten minutes.

I have a big log dump with a lot of juicy error messages in it, if you want
to take a look. Some examples:

=CRASH REPORT==== 25-Sep-2013::07:40:26 ===
  crasher:
    initial call: gen:init_it/6
    pid: <0.1142.0>
    registered_name: []
    exception exit: {bad_return_value,
                        {error,
                            {{badmatch,[]},
                             [{rabbit_mirror_queue_master,
                                  '-init_with_existing_bq/3-fun-0-',3,[]},
                              {mnesia_tm,apply_fun,3,
                                  [{file,"mnesia_tm.erl"},{line,830}]},
                              {mnesia_tm,execute_transaction,5,
                                  [{file,"mnesia_tm.erl"},{line,810}]},
                              {rabbit_misc,
                                  '-execute_mnesia_transaction/1-fun-0-',1,[]},
                              {worker_pool_worker,handle_call,3,[]},
                              {gen_server2,handle_msg,2,[]},
                              {proc_lib,init_p_do_apply,3,
                                  [{file,"proc_lib.erl"},{line,239}]}]}}}
      in function  gen_server2:terminate/3 
    ancestors: [rabbit_amqqueue_sup,rabbit_sup,<0.125.0>]
    messages: []
    links: [<0.247.0>,<0.1745.0>,#Port<0.22581>]
    dictionary: [{{#Ref<0.0.0.32922>,fhc_handle},
                   {handle,{file_descriptor,prim_file,{#Port<0.22581>,853}},
                           0,false,0,infinity,[],true,
                           "/var/lib/rabbitmq/mnesia/rabbit@turtle-01/queues/5U4MQIEI1ZAQX0119BWIV8SP0/journal.jif",
                           [write,binary,raw,read],
                           [{write_buffer,infinity}],
                           true,true,
                           {1380,94826,345789}}},
                  {{"/var/lib/rabbitmq/mnesia/rabbit@turtle-01/queues/5U4MQIEI1ZAQX0119BWIV8SP0/journal.jif",
                    fhc_file},
                   {file,1,true}},
                  {fhc_age_tree,{1,
                                 {{1380,94826,345789},
                                  #Ref<0.0.0.32922>,nil,nil}}},
                  {guid,{{4087053537,1505430215,3464040155,2830024215},1}}]
    trap_exit: true
    status: running
    heap_size: 2586
    stack_size: 27
    reductions: 3923
  neighbours:
    neighbour: [{pid,<0.1746.0>},
                  {registered_name,[]},
                  {initial_call,{gen,init_it,
                                     ['Argument__1','Argument__2',
                                      'Argument__3','Argument__4',
                                      'Argument__5','Argument__6']}},
                  {current_function,{gen_server2,process_next_msg,1}},
                  {ancestors,[<0.1745.0>,<0.1142.0>,rabbit_amqqueue_sup,
                              rabbit_sup,<0.125.0>]},
                  {messages,[]},
                  {links,[<0.1745.0>]},
                  {dictionary,[{random_seed,{1381,3909,13080}}]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,610},
                  {stack_size,7},
                  {reductions,213}]
    neighbour: [{pid,<0.1745.0>},
                  {registered_name,[]},
                  {initial_call,{gen,init_it,
                                     ['Argument__1','Argument__2',
                                      'Argument__3','Argument__4',
                                      'Argument__5','Argument__6']}},
                  {current_function,{gen_server2,process_next_msg,1}},
                  {ancestors,[<0.1142.0>,rabbit_amqqueue_sup,rabbit_sup,
                              <0.125.0>]},
                  {messages,[]},
                  {links,[<0.1142.0>,<0.1746.0>]},
                  {dictionary,[]},
                  {trap_exit,false},
                  {status,waiting},
                  {heap_size,233},
                  {stack_size,7},
                  {reductions,104}]

=SUPERVISOR REPORT==== 25-Sep-2013::07:40:26 ===
     Supervisor: {local,rabbit_amqqueue_sup}
     Context:    child_terminated
     Reason:     {bad_return_value,
                     {error,
                         {{badmatch,[]},
                          [{rabbit_mirror_queue_master,
                               '-init_with_existing_bq/3-fun-0-',3,[]},
                           {mnesia_tm,apply_fun,3,
                               [{file,"mnesia_tm.erl"},{line,830}]},
                           {mnesia_tm,execute_transaction,5,
                               [{file,"mnesia_tm.erl"},{line,810}]},
                           {rabbit_misc,
                               '-execute_mnesia_transaction/1-fun-0-',1,[]},
                           {worker_pool_worker,handle_call,3,[]},
                           {gen_server2,handle_msg,2,[]},
                           {proc_lib,init_p_do_apply,3,
                               [{file,"proc_lib.erl"},{line,239}]}]}}}
     Offender:   [{pid,<0.1142.0>},
                  {name,rabbit_amqqueue},
                  {mfa,
                      {rabbit_amqqueue_process,start_link,
                          [{amqqueue,
                               {resource,<<"bizmvclr">>,queue,
                                   <<"tmp_topic-0.2714721148367971">>},
                               true,true,<0.9362.162>,[],<0.2758.162>,[],[],
                               [{vhost,<<"bizmvclr">>},
                                {name,<<"HA">>},
                                {pattern,<<"^(?!amq\\.).*">>},
                                {definition,[{<<"ha-mode">>,<<"all">>}]},
                                {priority,0}],
                               []}]}},
                  {restart_type,temporary},
                  {shutdown,4294967295},
                  {child_type,worker}]

And:

=SUPERVISOR REPORT==== 25-Sep-2013::07:08:10 ===
     Supervisor: {<0.4774.372>,rabbit_connection_sup}
     Context:    shutdown_error
     Reason:     channel_termination_timeout
     Offender:   [{pid,<0.22496.371>},
                  {name,reader},
                  {mfa,{rabbit_reader,start_link,
                                      [<0.2222.372>,<0.28585.371>,
                                       #Fun<rabbit_heartbeat.2.69784259>]}},
                  {restart_type,intrinsic},
                  {shutdown,4294967295},
                  {child_type,worker}]
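
For anyone wading through a similar dump, here is a minimal sketch that tallies the report headers by kind, to get a first overview of what is in the log. It is not part of RabbitMQ. The function name tally_reports is made up for illustration, and it assumes every report starts with a header line of the same shape as the ones quoted above.

```python
import re
from collections import Counter

# Matches header lines such as
#   =CRASH REPORT==== 25-Sep-2013::07:40:26 ===
# capturing the report kind (CRASH, SUPERVISOR, ...) and the timestamp.
HEADER = re.compile(r"^=([A-Z ]+) REPORT==== (\S+) ===\s*$")


def tally_reports(lines):
    """Count report headers by kind in an iterable of log lines.

    Hypothetical helper: it only looks at header lines and ignores
    the bodies of the reports.
    """
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file object to tally_reports returns a Counter keyed by report kind, so something like counts.most_common() shows which kind dominates the dump.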






