[rabbitmq-discuss] Rabbitmq crashed (2.8.7 version)

sagu prf sagu.prf1 at gmail.com
Tue Feb 4 16:22:36 GMT 2014


Hello team,

We are running RabbitMQ 2.8.7. Yesterday morning a network outage was
detected on one of the cluster nodes (the one that was master during
that network event). The cluster did not fail over or recover after the
network outage.

The management console shows the queue status as unknown.

Cluster status:

rabbitmq04 - shows all cluster nodes as active
rabbitmq05 - shows only 05 and 06 as up (it considers node 04 down)
rabbitmq06 - shows only 05 and 06 as up (it considers node 04 down)
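
The disagreement between the three views can be made explicit with a
small check. This is only a sketch: the node labels are the short names
used in this mail, not the real Erlang node names, and the helper name
disagreements is made up for illustration.

```python
# Each node's view of which cluster nodes are running, as listed above.
# The labels are the short names from this mail (an assumption), not the
# real Erlang node names.
views = {
    "rabbitmq04": {"rabbitmq04", "rabbitmq05", "rabbitmq06"},
    "rabbitmq05": {"rabbitmq05", "rabbitmq06"},
    "rabbitmq06": {"rabbitmq05", "rabbitmq06"},
}


def disagreements(views):
    """Return sorted (observer, missing_node) pairs where the observer does
    not see a node that at least one node reports as running."""
    all_seen = set().union(*views.values())
    return sorted(
        (observer, node)
        for observer, seen in views.items()
        for node in all_seen - seen
    )


for observer, node in disagreements(views):
    print(f"{observer} does not see {node}")
```

An empty result would mean all nodes agree on the set of running nodes.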

To resolve this issue:

I restarted the RabbitMQ service on the cluster nodes one by one, but I
noticed that queues were missing after the restart.

I restarted the rabbitmq06 and rabbitmq05 machines and kept rabbitmq04
online. After mq06 and mq05 came back up, I restarted rabbitmq04.

I uploaded the backup config file to fix the missing queues, and
RabbitMQ came up.

After 2 hours, RabbitMQ became unresponsive (even the command-line
list_queues was not working on rabbitmq05 and rabbitmq06).

I can see the list of queues on only one node (rabbitmq04).


SASL logs:
---------------

rabbitmq@nc1-iphonesys-mq05:/var/log/rabbitmq$ head -1000 rabbit@nc1-iphonesys-mq05-sasl.log.1


=CRASH REPORT==== 3-Feb-2014::18:59:36 ===
  crasher:
    initial call: gen:init_it/6
    pid: <0.286.0>
    registered_name: []
    exception exit: {normal,
                        {gen_server2,call,
                            [<0.285.0>,{gm_deaths,[<3201.298.0>]},infinity]}}
      in function  gen_server2:terminate/3
    ancestors: [<0.285.0>,rabbit_mirror_queue_slave_sup,rabbit_sup,
                  <0.120.0>]
    messages: []
    links: []
    dictionary: [{random_seed,{27359,22849,24486}}]
    trap_exit: false
    status: running
    heap_size: 1597
    stack_size: 24
    reductions: 9950
  neighbours:


=SUPERVISOR REPORT==== 3-Feb-2014::19:15:55 ===
     Supervisor: {<0.5400.9>,rabbit_channel_sup_sup}
     Context:    shutdown_error
     Reason:     noproc
     Offender:   [{pid,<0.5424.9>},
                  {name,channel_sup},
                  {mfa,{rabbit_channel_sup,start_link,[]}},
                  {restart_type,temporary},
                  {shutdown,infinity},
                  {child_type,supervisor}]



=CRASH REPORT==== 3-Feb-2014::20:19:20 ===
  crasher:
    initial call: rabbit_reader:init/4
    pid: <0.1467.9>
    registered_name: []
    exception exit: channel_termination_timeout
      in function  rabbit_reader:wait_for_channel_termination/2
      in call from rabbit_reader:send_exception/3
      in call from rabbit_reader:terminate/2
      in call from rabbit_reader:handle_other/3
      in call from rabbit_reader:start_connection/7
    ancestors: [<0.1464.9>,rabbit_tcp_client_sup,rabbit_sup,<0.120.0>]
    messages: [emit_stats,
                  {tcp,#Port<0.51699>,<<8,0,0,0,0,0,0,206>>},
                  {'EXIT',#Port<0.51699>,normal}]
    links: []
    dictionary: [{{ch_pid,<0.1515.9>},{1,#Ref<0.0.30.173849>}},
                  {{channel,1},
                   {<0.1515.9>,{method,rabbit_framing_amqp_0_9_1}}}]
    trap_exit: true
    status: running
    heap_size: 2584
    stack_size: 24
    reductions: 141714
  neighbours:


=SUPERVISOR REPORT==== 3-Feb-2014::20:19:20 ===
     Supervisor: {<0.1464.9>,rabbit_connection_sup}
     Context:    shutdown_error
     Reason:     channel_termination_timeout
     Offender:   [{pid,<0.1467.9>},
                  {name,reader},
                  {mfa,{rabbit_reader,start_link,
                                      [<0.1466.9>,<0.1465.9>,
                                       #Fun<rabbit_heartbeat.2.20850803>]}},
                  {restart_type,intrinsic},
                  {shutdown,4294967295},
                  {child_type,worker}]



=CRASH REPORT==== 3-Feb-2014::20:19:20 ===
  crasher:
    initial call: rabbit_reader:init/4
    pid: <0.2780.9>
    registered_name: []
    exception exit: channel_termination_timeout
      in function  rabbit_reader:wait_for_channel_termination/2
      in call from rabbit_reader:send_exception/3
      in call from rabbit_reader:terminate/2
      in call from rabbit_reader:handle_other/3
      in call from rabbit_reader:start_connection/7
    ancestors: [<0.2777.9>,rabbit_tcp_client_sup,rabbit_sup,<0.120.0>]
    messages: [{'EXIT',#Port<0.52321>,normal}]
    links: []
    dictionary: [{{channel,1},
                   {<0.2786.9>,{method,rabbit_framing_amqp_0_9_1}}},
                  {{ch_pid,<0.2786.9>},{1,#Ref<0.0.31.26199>}}]
    trap_exit: true
    status: running
    heap_size: 2584
    stack_size: 24
    reductions: 119601
  neighbours:
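
Since the full SASL log is long, a small script along these lines could
count the reports by type and reason. It is a rough sketch, not a proper
SASL parser: the two regular expressions are assumptions based only on
the excerpt above, the function name summarize is made up, and for a
multi-line reason term only the first line is captured (for example
"{normal,").

```python
import re
from collections import Counter

# Header lines look like: =CRASH REPORT==== 3-Feb-2014::18:59:36 ===
HEADER = re.compile(r"^=(?P<kind>[A-Z ]+) REPORT==== (?P<ts>\S+) ===\s*$")
# Reason lines: "exception exit: X" in crash reports,
# "Reason: X" in supervisor reports.
REASON = re.compile(r"^\s*(?:exception exit|Reason):\s*(?P<reason>\S.*)$")


def summarize(lines):
    """Count reports by (kind, first line of the reason)."""
    counts = Counter()
    kind = None
    for line in lines:
        header = HEADER.match(line)
        if header:
            kind = header.group("kind")
            continue
        reason = REASON.match(line)
        if reason and kind is not None:
            counts[(kind, reason.group("reason").strip())] += 1
            kind = None  # count at most one reason per report
    return counts
```

It takes any iterable of lines, so an open log file can be passed in
directly, e.g. summarize(open(path)) with path pointing at the SASL log.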


Regards
sagu

